Preface


The subject of this book is the solution of polynomial equations, that is, sys- 
tems of (generally) non-linear algebraic equations. This study is at the heart 
of several areas of mathematics and its applications. It has provided the mo- 
tivation for advances in different branches of mathematics such as algebra, 
geometry, topology, and numerical analysis. In recent years, an explosive de- 
velopment of algorithms and software has made it possible to solve many 
problems which had been intractable up to then and greatly expanded the 
areas of applications to include robotics, machine vision, signal processing, 
structural molecular biology, computer-aided design and geometric modelling, 
as well as certain areas of statistics, optimization and game theory, and bio- 
logical networks. At the same time, symbolic computation has proved to be 
an invaluable tool for experimentation and conjecture in pure mathematics. 
As a consequence, the interest in effective algebraic geometry and computer 
algebra has extended well beyond its original constituency of pure and applied 
mathematicians and computer scientists, to encompass many other scientists 
and engineers. While the core of the subject remains algebraic geometry, it 
also calls upon many other aspects of mathematics and theoretical computer 
science, ranging from numerical methods, differential equations and number 
theory to discrete geometry, combinatorics and complexity theory. 

The goal of this book is to provide a general introduction to modern math- 
ematical aspects in computing with multivariate polynomials and in solving 
algebraic systems. It is aimed at upper-level undergraduate and graduate
students, and at researchers in pure and applied mathematics and engineering
who are interested in computational algebra and in the connections between
computer algebra and numerical mathematics. Most chapters assume a solid
grounding in linear algebra, while for several of them a basic knowledge of
Gröbner bases, at the level of [CLO97], is expected. Gröbner bases have become
a basic standard tool in computer algebra and the reader may consult any other
textbook such as [AL94, BW93, CLO98, GP02], or the introductory chapter 
in [CCS99]. Below we discuss briefly the content of each chapter and some of 
their prerequisites.

The book describes foundations, recent developments and applications of
Gröbner and border bases, residues, multivariate resultants, including toric
elimination theory, primary decomposition of ideals, multivariate polynomial 
factorization, as well as homotopy continuation methods. While some of the 
chapters are introductory in nature, others present the state-of-the-art in sym- 
bolic techniques in polynomial system solving, including effective and algo- 
rithmic methods in algebraic geometry and computational algebra, complexity 
issues, and applications. We also discuss several numeric and symbolic-numeric 
methods. This is not a standard textbook in that each chapter is independent 
and, largely, self-contained. However, there are strong links between the dif- 
ferent chapters as evidenced by the many cross-references. While the reader 
gains the advantage of being able to access the book at many different places 
and of seeing the interplay of different views of the same concepts, we should 
note that, because of the different needs and traditions, some notations in- 
evitably vary between different chapters. We have tried to note this in the text 
whenever it occurs. The single bibliography and index underline the unity of 
the subject. 

The first chapter gives an introduction to the notions of residues and re- 
sultants, and the interplay between them, starting with the univariate case 
and synthesizing different approaches. The sections on univariate residues and 
resultants could be used in an undergraduate course on complex analysis, ab- 
stract algebra, or computational algebra as an introduction to more advanced 
topics and to illustrate the interdependence of different areas of mathematics. 
The multivariate sections, on the other hand, directed to graduate students 
and researchers, are intended as an introduction to concepts which are widely 
used in current research and applications. 

The second chapter puts the accent on linear algebra methods to deal 
with polynomial systems: the multiplication maps in the quotient algebra by 
a polynomial ideal are linear and allow for the use of eigenvalues and eigen- 
vectors, duality, etc. Applications to Galois theory, factoring, and primary 
decomposition are offered. The first sections require, besides standard linear 
algebra, some background on computational algebraic geometry (for instance, 
the first five chapters of [CLO97]). Some acquaintance with local rings (as in 
Chapter 4 of [CLO98]) would also be helpful. Known basic facts about field 
extensions and Galois theory are assumed in the last part. 

The third chapter also elaborates on the concepts in the first two chapters, 
and combines them with numerical methods for polynomial system solving 
and several applications. The tools and methods developed are used to solve 
problems arising in implicitization of rational surfaces, determination of the 
position of a camera or a parallel robot, molecular conformations, and blind 
identification in signal processing. The required background is very similar to 
that needed for the first sections of Chapter 2. 

Chapter 4 is devoted to laying the algebraic foundations for border bases of 
ideals, an extension of the theory of Gröbner bases, yielding more flexible bases
of the quotient algebras. Border bases yield a connection between computer
algebra and numerical mathematics. An application to design of experiments
in statistics is included. 

The fifth chapter concentrates on various techniques for computing pri- 
mary decomposition of ideals. This machinery is applied to study an interest- 
ing class of ideals coming from Bayesian networks, establishing an important 
link between algebraic geometry and the emerging field of algebraic statistics. 
Besides Gröbner bases, the readers are expected to have a casual understanding
of the algebra-geometry dictionary between ideals in polynomial rings
and their zero sets. Many propositions that can be found in the literature
are stated without proof, but the chapter contains several accessible exercises 
dealing with the structure and decomposition of polynomial ideals. 

Chapter 6 studies the inherent complexity of polynomial system solving 
when working with the dense encoding of the input polynomials and under 
the model of straight-line programs, i.e., when polynomials are not given by 
their monomials but by evaluation programs. Being a brief survey of alge- 
braic complexity applied to computational algebraic geometry, there is not 
much background required, though knowledge of basic notions of algebraic 
geometry and commutative algebra would be helpful. The chapter is mostly 
self-contained; when necessary, basic bibliography supplements are indicated. 

Chapter 7 is devoted to the study of sparse systems of polynomial equa- 
tions, i.e., algebraic equations with a specific monomial structure, presenting 
a comprehensive state-of-the-art introduction to the field. Combinatorial and 
discrete geometry, together with matrices of special structure, are ingredients 
of the presentation of toric (or sparse) elimination theory. The chapter fo- 
cuses on applications to geometric modelling and computer-aided design. It 
also provides the tools for exploiting the structure of algebraic systems which 
may arise in different applications. Some basic knowledge of discrete geometry 
for polyhedral objects in arbitrary dimension is assumed. This chapter will 
be of particular interest to graduate students and researchers in theoretical 
computer science or applied mathematics wishing to combine discrete and 
algebraic geometry. 

Chapter 8 deals with numerical algebraic geometry, a term coined some 
years ago to describe a new field, which bears the same relation to algebraic 
geometry as numerical linear algebra does to linear algebra. Modern homotopy 
methods to describe solution components of polynomial systems are presented. 
The prerequisites include a basic course in numerical analysis, in particular 
Newton’s method for nonlinear systems. Because of the numerical flavor of 
the proposed methods, this chapter is expected to be particularly appealing 
to engineers. 

Lastly, Chapter 9 gives a complete overview of old and recent methods for 
the important problem of approximate factorization of a multivariate polyno- 
mial, in other words, the complex irreducible decomposition of a hypersurface. 
The main techniques rely on approximate numerical computation but the results
are exact and certified. It is addressed to students and researchers with
some basic knowledge of commutative algebra, algebraic numbers and
holomorphic functions of several variables.

This book grew out of Course Notes prepared for the CIMPA Graduate 
School on Systems of Polynomial Equations that we organized in Buenos Aires,
in July 2003. We take this opportunity to thank CIMPA for the funding and
the academic support to carry out this activity. We are also grateful for the
support from the following institutions: International Centre for Theoretical
Physics (ICTP, Italy), Consejo Nacional de Investigaciones Científicas y
Técnicas (CONICET, Argentina), Institut National de Recherche en Informatique
et en Automatique (INRIA, France), PROSUL Programme from CNPq
(Brazil), Délégation régionale de coopération Française au Chili, and
Universidad de Buenos Aires (Argentina). We also thank ECOS-Sud, whose project
A00E02 between INRIA and Universidad de Buenos Aires provided the initial
framework for our collaboration. Special thanks go to Gregorio Malajovich
and Alvaro Rittatore, who co-organized with us the I Latin American Workshop
on Polynomial Systems which followed the School. Finally, we would like
to thank deeply all the speakers and all the participants. 


December 2004 Alicia Dickenstein and Ioannis Z. Emiris


Introduction to residues and resultants


Eduardo Cattani*¹ and Alicia Dickenstein**²


¹ Department of Mathematics and Statistics - University of Massachusetts,
Amherst, MA 01003, USA, cattani@math.umass.edu

² Departamento de Matemática - FCEyN - Universidad de Buenos Aires, Ciudad
Universitaria - Pab. I - (1428) Buenos Aires, Argentina, alidick@dm.uba.ar


Summary. This chapter is an expanded version of the lecture notes prepared by the 
second-named author for her introductory course at the CIMPA Graduate School 
on Systems of Polynomial Equations held in Buenos Aires, Argentina, in July 2003. 
We present an elementary introduction to residues and resultants and outline some 
of their multivariate generalizations. Throughout we emphasize the application of 
these ideas to polynomial system solving. 


1.0 Introduction 


This chapter is an introduction to the theory of residues and of resultants. 
These are very classical topics with a long and distinguished history. It is not 
our goal to present a full historical account of their development but rather 
to introduce the basic notions in the one-dimensional case, to discuss some of
their applications (in particular, those related to polynomial system solving),
and to present their multivariate generalizations. We emphasize in particular
the applications of residues to duality theory and the explicit computation of 
resultants which, in turn, results in the explicit elimination of variables. 

Most readers are probably familiar with the classical theory of local 
residues which was introduced by Augustin-Louis Cauchy in 1825 as a pow- 
erful tool for the computation of integrals and for the summation of infi- 
nite series. Perhaps less familiar is the fact that given a meromorphic form 
(H(z)/P(z))dz on the complex plane, its global residue, i.e. the sum of local 
residues at the zeros of P, defines an easily computable linear functional on 
the quotient algebra A := C[z]/(P(z)) whose properties encode many impor- 
tant features of this algebra. As in Chapters 2 and 3, it is through the study of 
this algebra, and its multivariate generalization, that we make the connection 
with the roots of the associated polynomial system. 


* Partially supported by NSF Grant DMS-0099707. 
** Partially supported by Action A00E02 of the ECOS-SeTCIP French-Argentina
bilateral collaboration, UBACYT X052 and ANPCYT 03-6568, Argentina.

The basic definitions and properties of the univariate residue are reviewed
in Section 1.1 and we discuss some nice applications in Section 1.2. Although 
there are many different possible definitions of the residue, we have chosen 
to follow the classical integral approach for the definition of the local residue. 
Alternatively, one could define the global residue by its algebraic properties 
and use ring localization to define the local residue. We indicate how this is 
done in a particular case. 

In Section 1.5 we study multidimensional residues. Although, as coeffi- 
cients of certain Laurent expansions, they are already present in the work of 
Jacobi [Jac30], the first systematic treatment of bivariate residue integrals is 
the 1887 memoir of Poincaré [Poi87], more than 60 years after the introduction 
of univariate residues. He makes the very interesting observation that geome- 
ters were long stopped from extending the one-dimensional theory because of 
the lack of geometric intuition in 4 dimensions (referring to $\mathbb{C}^2$). The modern
theory of residues and the duality in algebraic geometry is due to Leray and
Grothendieck. There have been many developments since the late 70’s: on the
algebro-geometric side with the work of Grothendieck (cf. [Har66]); in analytic
geometry where we may mention the books by Griffiths and Harris [GH78]
and Arnold, Varchenko and Gusein-Zadé [AGZV85]; in commutative algebra
with the work of Scheja and Storch [SS75, SS79], Kunz [Kun86], and Lipman
[Lip87]; and on the analytic side with the residual currents approach pioneered
by Coleff and Herrera [CH78]. In the 90’s the possibility of implementing sym- 
bolic computations brought about another important expansion in the theory 
and computation of multidimensional residues and its applications to elimi- 
nation theory as pioneered by the Krasnoyarsk school [AY83, BKL98, Tsi92]. 
It would, of course, be impossible to fully present all these approaches to the 
theory of residues or to give a complete account of all of its applications. In- 
deed, even a rigorous definition of multivariate residues would take us very 
far afield. Instead we will attempt to give an intuitive idea of this notion, 
explain some of its consequences, and describe a few of its applications. In 
analogy with the one-variable case we will begin with an “integral” definition 
of local residue from which we will define the total residue as a sum of local 
ones. The reader who is not comfortable with integration of differential forms 
should not despair since, as in the univariate case, we soon show how one can 
give a purely algebraic definition of global, and then local, residues using Be- 
zoutians. We also touch upon the geometric definition of Arnold, Varchenko 
and Gusein-Zadé. 

In Sections 1.3 and 1.4 we discuss the definition and application of the 
univariate resultant. This is, again, a very classical concept which goes back 
to the work of Euler, Bézout, Sylvester and Cayley. It was directly motivated 
by the problem of elimination of variables in systems of polynomial equa- 
tions. While the idea behind the notion of the resultant is very simple, its 
computation leads to very interesting problems such as the search for deter- 
minantal formulas. We recall the classical Sylvester and Bezoutian matrices 
in Section 1.4.

The rebirth of the classical theory of elimination in the last decade owes
much to the work of Jouanolou [Jou79, Jou91, Jou97] and of Gelfand, Kapra- 
nov and Zelevinsky [GKZ94], as well as to the possibility of using resultants 
not only as a computational tool to solve polynomial systems but also to study 
their complexity aspects. In particular, homogeneous and multi-homogeneous 
resultants are essential tools in the implicitization of surfaces. We discuss the 
basic constructions and properties in Section 1.6. We refer to [Stu93, Stu98],
[Stu02, Ch. 4] and to Chapters 2, 3, and 7 in this book for further background
and applications. A new theoretical tool in elimination theory yet to be fully
explored is the use of exterior algebra methods in commutative algebra
(starting with Eisenbud and Schreyer [ESW03] and Khetan [Khe03, Khe]).

In the last section of this chapter we recall how the resultant appears 
naturally as the denominator of the residue and apply this to obtain a normal 
form algorithm for the computation of resultants which, as far as we know, 
has not been noted before. 

Although many of the results in this chapter, including those in the last 
section, are valid in much greater generality, we have chosen to restrict most 
of the exposition to the affine and projective cases. We have tried to direct 
the reader to the appropriate references. 

For further reading we refer to a number of excellent books on the topics 
treated here: [AY83, AGZV85, CLO98, GKZ94, GH78, EM, Tsi92]. 


1.1 Residues in one variable 


1.1.1 Local analytic residue 


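We recall that, given a holomorphic function $h(z)$ with an isolated singularity
at a point $\xi$ in $\mathbb{C}$, we may consider its Laurent expansion

$$h(z) = \sum_{n=1}^{\infty} \frac{b_n}{(z-\xi)^n} + \tilde{h}(z),$$

where $\tilde{h}$ is holomorphic in a neighborhood of $\xi$, and define the residue of $h$ at
$\xi$ as

$$\operatorname{res}_\xi(h) = b_1. \tag{1.1}$$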


The classical Residue Theorem tells us that the residue is “what remains after
integrating” the differential form $(1/2\pi i)\, h(z)\, dz$ on a small circle around $\xi$.
Precisely:

$$\operatorname{res}_\xi(h) = \frac{1}{2\pi i} \int_{|z-\xi|=\delta} h(z)\, dz,$$

for any sufficiently small positive $\delta$ and where the circle $\{|z-\xi| = \delta\}$ is
oriented counter-clockwise.

Remark 1.1.1. As defined in (1.1), the residue depends on the choice of local
coordinate $z$. Associating the residue to the meromorphic 1-form $h(z)\,dz$
makes it invariant under local change of coordinates. We will, however, maintain
the classical notation, $\operatorname{res}_\xi(h)$, rather than write $\operatorname{res}_\xi(h(z)\,dz)$.
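
The integral formula for the local residue can be checked numerically. The following Python sketch is only an illustration: the helper name `contour_residue` and the sample function $h(z) = 1/(z^2(z-1))$ are our own choices, not taken from the text. It approximates the integral by the trapezoidal rule on the circle and compares the result with the Laurent coefficient $b_1 = -1$ at $\xi = 0$.

```python
import cmath
import math

def contour_residue(h, center, radius, samples=512):
    """Approximate (1/(2*pi*i)) * integral of h over |z - center| = radius."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        offset = radius * cmath.exp(1j * theta)   # z - center on the circle
        # dz = i * offset * dtheta, so the integrand contributes h(z) * offset / samples.
        total += h(center + offset) * offset
    return total / samples

def h(z):
    # Illustrative choice: near 0, h(z) = -1/z^2 - 1/z - 1 - z - ..., so b_1 = -1.
    return 1.0 / (z * z * (z - 1.0))

approx = contour_residue(h, center=0.0, radius=0.5)
print(approx)
```

The radius must be small enough that no other singularity of $h$ lies on or inside the circle; here the only other pole is at $z = 1$.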


We can also think of the residue of a holomorphic function $h$ at $\xi$ as a
linear operator $\operatorname{res}_\xi[h] : \mathcal{O}_\xi \to \mathbb{C}$, which assigns to any holomorphic function
$f$ defined near $\xi$ the complex number

$$\operatorname{res}_\xi[h](f) := \operatorname{res}_\xi(f \cdot h).$$

Suppose $h$ has a pole at $\xi$ of order $m$. Then, the action of $\operatorname{res}_\xi[h]$ maps

$$1 \mapsto b_1, \quad z-\xi \mapsto b_2, \quad \ldots, \quad (z-\xi)^{m-1} \mapsto b_m,$$

and for any $k \ge m$, $(z-\xi)^k \mapsto 0$ since $(z-\xi)^k \cdot h$ is holomorphic at $\xi$. These
values suffice to characterize the residue map $\operatorname{res}_\xi[h]$ in this case: indeed, given
$f$ holomorphic near $\xi$, we write


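$$f(z) = \sum_{j=0}^{m-1} \frac{f^{(j)}(\xi)}{j!}\,(z-\xi)^j + (z-\xi)^m g(z),$$

with $g$ holomorphic in a neighborhood of $\xi$. Therefore

$$\operatorname{res}_\xi[h](f) = \sum_{j=0}^{m-1} \frac{f^{(j)}(\xi)}{j!}\, \operatorname{res}_\xi[h]\big((z-\xi)^j\big) = \sum_{j=0}^{m-1} \frac{b_{j+1}}{j!}\, f^{(j)}(\xi). \tag{1.2}$$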


Note, in particular, that the residue map $\operatorname{res}_\xi[h]$ is then the evaluation at $\xi$ of
a constant coefficient differential operator and that it carries the information
of the principal part of $h$ at $\xi$.


1.1.2 Residues associated to polynomials 


In these notes we will be interested in the algebraic and computational aspects
of residues and therefore we shall restrict ourselves to the case when $h(z)$
is a rational function $h(z) = H(z)/P(z)$, $H, P \in \mathbb{C}[z]$. Clearly, $\operatorname{res}_\xi(h) = 0$
unless $P(\xi) = 0$. It is straightforward to check the following basic properties
of residues:


• If $\xi$ is a simple zero of $P$, then

$$\operatorname{res}_\xi\!\left(\frac{H(z)}{P(z)}\right) = \frac{H(\xi)}{P'(\xi)}. \tag{1.3}$$

• If $\xi$ is a root of $P$ of multiplicity $m$, then

$$\operatorname{res}_\xi\!\left(\frac{H(z)\,P'(z)}{P(z)}\right) = m \cdot H(\xi). \tag{1.4}$$

Since $(P'(z)/P(z))\,dz = d(\ln P(z))$ wherever a logarithm $\ln P$ of $P$ is defined,
the expression above is often called the (local) logarithmic residue.
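
Properties (1.3) and (1.4) can be tested numerically with a contour integral. The sketch below is a hedged illustration: the polynomial $P(z) = (z-2)^3(z+1)$, the numerator $H(z) = z^2+1$ and the helper `contour_residue` are assumptions chosen for the example, not objects from the text.

```python
import cmath
import math

def contour_residue(h, center, radius, samples=512):
    """Approximate (1/(2*pi*i)) * integral of h over |z - center| = radius."""
    total = 0j
    for k in range(samples):
        offset = radius * cmath.exp(2j * math.pi * k / samples)
        total += h(center + offset) * offset   # dz = i * offset * dtheta
    return total / samples

# Illustrative data: P has a simple zero at -1 and a triple zero at 2.
def H(z):
    return z * z + 1.0

def P(z):
    return (z - 2.0) ** 3 * (z + 1.0)

def dP(z):
    return 3.0 * (z - 2.0) ** 2 * (z + 1.0) + (z - 2.0) ** 3

# (1.3): residue of H/P at the simple zero -1 should equal H(-1)/P'(-1).
simple = contour_residue(lambda z: H(z) / P(z), center=-1.0, radius=0.5)
# (1.4): residue of H*P'/P at the zero 2 of multiplicity 3 should equal 3*H(2).
logarithmic = contour_residue(lambda z: H(z) * dP(z) / P(z), center=2.0, radius=0.5)
print(simple, H(-1.0) / dP(-1.0))
print(logarithmic, 3 * H(2.0))
```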


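Given a polynomial $P \in \mathbb{C}[z]$, its polar set $Z_P := \{\xi \in \mathbb{C} : P(\xi) = 0\}$ is
finite and we can consider the total sum of local residues

$$\operatorname{res}\!\left(\frac{H}{P}\right) = \sum_{\xi \in Z_P} \operatorname{res}_\xi(H/P),$$

where $H \in \mathbb{C}[z]$. We will be particularly interested in the global residue
operator.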


Definition 1.1.2. The global residue $\operatorname{res}_P : \mathbb{C}[z] \to \mathbb{C}$ is the sum of local
residues:

$$\operatorname{res}_P(H) = \sum_{\xi \in Z_P} \operatorname{res}_\xi(H/P).$$


Remark 1.1.3. We may define the sum of local residues over the zero set of $P$
for any rational function $h$ which is regular on $Z_P$. Moreover, if we write
$h = H/Q$, with $Z_P \cap Z_Q = \emptyset$, then by the Nullstellensatz, there exist polynomials
$R, S$ such that $1 = RP + SQ$. It follows that the total sum of local residues

$$\sum_{\xi \in Z_P} \operatorname{res}_\xi(h/P) = \operatorname{res}_P(HS)$$

coincides with the global residue of the polynomial $HS$.


Let $R > 0$ be large enough so that $Z_P$ is contained in the open disk
$\{|z| < R\}$. Then, for any polynomial $H$ the rational function $h = H/P$ is
holomorphic for $|z| > R$ and has a Laurent expansion $\sum_{n \in \mathbb{Z}} e_n z^n$ valid for
$|z| > R$. The residue of $h$ at infinity is defined as

$$\operatorname{res}_\infty(h) := -e_{-1}. \tag{1.5}$$

Note that integrating term by term the Laurent expansion, we get

$$\operatorname{res}_\infty(h) = -\frac{1}{2\pi i} \int_{|z|=R} h(z)\, dz.$$


Since by the Residue Theorem,

$$\operatorname{res}_P(H) = \frac{1}{2\pi i} \int_{|z|=R} \frac{H(z)}{P(z)}\, dz,$$

we easily deduce

Proposition 1.1.4. Let $P, H \in \mathbb{C}[z]$. Then $\operatorname{res}_P(H) = -\operatorname{res}_\infty(H/P)$.


Remark 1.1.5. We note that the choice of sign in (1.5) is consistent with
Remark 1.1.1: If $h = H/P$ is holomorphic for $|z| > R$, then we may regard $h$
as being holomorphic in a punctured neighborhood of the point at infinity in
the Riemann sphere $S^2 = \mathbb{C} \cup \{\infty\}$. Taking $w = 1/z$ as local coordinate at
infinity we have: $h(z)\,dz = -(h(1/w)/w^2)\,dw$ and

$$\operatorname{res}_0\big(-(h(1/w)/w^2)\big) = -e_{-1}. \tag{1.6}$$

Note also that Proposition 1.1.4 means that the sum of the local residues of
the extension of the meromorphic form $(H(z)/P(z))\,dz$ to the Riemann sphere
is zero.


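Proposition 1.1.6. Given $P, H \in \mathbb{C}[z]$, $\operatorname{res}_P(H)$ is linear in $H$ and is a
rational function of the coefficients of $P$ with coefficients in $\mathbb{Q}$.

Proof. The first statement follows from the definition of $\operatorname{res}_P(H)$ and the
linearity of the local residue. Thus, in order to prove the second statement
it suffices to consider $\operatorname{res}_P(z^k)$, $k \in \mathbb{N}$. Let $d = \deg P$, $P(z) = \sum_{j=0}^{d} a_j z^j$,
$a_d \ne 0$. Then, it follows from Proposition 1.1.4 and (1.6) that

$$\operatorname{res}_P(z^k) = \operatorname{res}_0\!\left(\frac{(1/w)^k}{w^2\, P(1/w)}\right) = \operatorname{res}_0\!\left(\frac{1}{w^{k+2-d}\, P_1(w)}\right),$$

where $P_1(w) = \sum_{j=0}^{d} a_j w^{d-j}$. Note that $P_1(0) = a_d \ne 0$ and therefore
$1/P_1(w)$ is holomorphic near 0. Hence

$$\operatorname{res}_P(z^k) = \begin{cases} 0 & \text{if } k+2-d \le 0, \\[4pt] \dfrac{1}{(k+1-d)!}\, \dfrac{d^{\,k+1-d}}{dw^{\,k+1-d}}\!\left(\dfrac{1}{P_1}\right)\!(0) & \text{if } k+1-d \ge 0. \end{cases} \tag{1.7}$$

Now, writing $P_1 = a_d\big(1 + \sum_{j=0}^{d-1} \frac{a_j}{a_d}\, w^{d-j}\big)$, the expression
$\frac{1}{(k+1-d)!}\, \frac{d^{\,k+1-d}}{dw^{\,k+1-d}}\big(\frac{1}{P_1}\big)(0)$ may
be computed as the $w^{k+1-d}$ coefficient of the geometric series

$$\frac{1}{a_d} \sum_{r=0}^{\infty} (-1)^r \left( \sum_{j=0}^{d-1} \frac{a_j}{a_d}\, w^{d-j} \right)^{\!r} \tag{1.8}$$

and the result follows.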
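
The computation in (1.7) and (1.8) amounts to reading off one coefficient of the power series $1/P_1(w)$. The following Python sketch carries this out in exact rational arithmetic; the function name `residue_of_monomial` and the list-of-coefficients representation (lowest degree first) are assumptions made for the illustration.

```python
from fractions import Fraction

def residue_of_monomial(a, k):
    """res_P(z^k) for P = sum a[j] z^j, read off as the w^(k+1-d) coefficient of 1/P_1(w)."""
    d = len(a) - 1
    n = k + 1 - d
    if n < 0:
        return Fraction(0)                 # case k + 2 - d <= 0 of (1.7)
    # P_1(w) = sum_j a_j w^(d-j), so the coefficient of w^i in P_1 is a[d - i].
    p1 = [Fraction(a[d - i]) for i in range(d + 1)]
    # Invert the power series term by term: (sum inv[m] w^m) * P_1(w) = 1.
    inv = [Fraction(1) / p1[0]]
    for m in range(1, n + 1):
        s = sum(p1[i] * inv[m - i] for i in range(1, min(m, d) + 1))
        inv.append(-s / p1[0])
    return inv[n]

# Example: P(z) = z^2 - 3z + 2 = (z - 1)(z - 2), where res_P(z^k) = 2^k - 1.
a = [2, -3, 1]
values = [residue_of_monomial(a, k) for k in range(4)]
print(values)
```

Only powers of $a_d$ appear in the denominators produced by the inversion loop, which reflects the rational dependence on the coefficients of $P$ stated in Proposition 1.1.6.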


In fact, we can extract from (1.7) and (1.8) the following more precise
dependence of the global residue on the coefficients of $P$.

Corollary 1.1.7. Given a polynomial $P = \sum_{j=0}^{d} a_j z^j \in \mathbb{C}[z]$ of degree $d$ and
$k \ge d-1$, there exists a polynomial with integer coefficients $C_k$ such that

$$\operatorname{res}_P(z^k) = \frac{C_k(a_0, \ldots, a_d)}{a_d^{\,k-d+2}}.$$

In particular, when $P, H$ have coefficients in a subfield $\mathbf{k}$, it holds that
$\operatorname{res}_P(H) \in \mathbf{k}$.

We also deduce from (1.7) a very important vanishing result:

Theorem 1.1.8. (Euler-Jacobi vanishing conditions) Given polynomials
$P, H \in \mathbb{C}[z]$ satisfying $\deg(H) \le \deg(P) - 2$, the global residue
$\operatorname{res}_P(H) = 0$.

We note that, in view of (1.3), when all the roots of $P$ are simple,
Theorem 1.1.8 reduces to the following algebraic statement: For every polynomial
$H \in \mathbb{C}[z]$, with $\deg H < \deg P - 1$,

$$\sum_{\xi \in Z_P} \frac{H(\xi)}{P'(\xi)} = 0. \tag{1.9}$$


The following direct proof of this statement was suggested to us by Askold
Khovanskii. Let $d = \deg(P)$, $Z_P = \{\xi_1, \ldots, \xi_d\}$, and $P(z) = a_d \prod_{i=1}^{d} (z - \xi_i)$.
Let $L_i$ be the Lagrange interpolating polynomial

$$L_i(z) = \frac{\prod_{j \ne i} (z - \xi_j)}{\prod_{j \ne i} (\xi_i - \xi_j)}.$$

For any polynomial $H$ with $\deg(H) \le d-1$,

$$H(z) = \sum_{i=1}^{d} H(\xi_i)\, L_i(z).$$

So, if $\deg(H) < d-1$, the coefficient of $z^{d-1}$ in this sum should be 0. But this
coefficient is precisely

$$\sum_{i=1}^{d} H(\xi_i)\, \frac{1}{\prod_{j \ne i} (\xi_i - \xi_j)} = a_d \sum_{i=1}^{d} \frac{H(\xi_i)}{P'(\xi_i)}.$$

Since $a_d \ne 0$, statement (1.9) follows.
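
Statement (1.9) is easy to test on a polynomial with known simple roots. The sketch below is a hedged illustration in exact arithmetic: the roots $1, 2, 4$, the leading coefficient $2$ and the helper names are our own choices. It uses $P'(\xi_i) = a_d \prod_{j \ne i} (\xi_i - \xi_j)$, as in the argument above.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate sum coeffs[j] * x^j (lowest degree first) by Horner's rule."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value

def residue_sum_simple_roots(h_coeffs, roots, lead):
    """Sum of H(xi)/P'(xi) over the simple roots of P = lead * prod (z - xi)."""
    total = Fraction(0)
    for i, xi in enumerate(roots):
        dp = Fraction(lead)                 # P'(xi) = lead * prod_{j != i} (xi - xj)
        for j, xj in enumerate(roots):
            if j != i:
                dp *= (xi - xj)
        total += evaluate(h_coeffs, xi) / dp
    return total

roots = [Fraction(1), Fraction(2), Fraction(4)]     # illustrative simple roots, d = 3
vanishing = residue_sum_simple_roots([5, 3], roots, lead=2)       # H = 3z + 5, deg H = d - 2
borderline = residue_sum_simple_roots([0, 0, 1], roots, lead=2)   # H = z^2, deg H = d - 1
print(vanishing, borderline)
```

For $\deg H = d-1$ the sum no longer vanishes; it equals the leading coefficient of $H$ divided by $a_d$, which is the value the same argument gives when the coefficient of $z^{d-1}$ is not zero.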


Since, clearly, $\operatorname{res}_P(G \cdot P) = 0$, for all $G \in \mathbb{C}[z]$, the global residue map $\operatorname{res}_P$
descends to $A := \mathbb{C}[z]/(P)$, the quotient algebra by the ideal generated by $P$.
On the other hand, if $\deg P = d$, then $A$ is a finite dimensional $\mathbb{C}$-vector space
of dimension $d$, and a basis is given by the classes of $1, z, \ldots, z^{d-1}$. As
in Chapter 2 we will denote by $[H]$ the class of $H$ in the quotient $A$. It follows
from (1.7) and (1.8) that, as a linear map,

$$\operatorname{res}_P : A \to \mathbb{C}$$

is particularly simple:

$$\operatorname{res}_P([z^k]) = \begin{cases} 0 & \text{if } 0 \le k \le d-2, \\ 1/a_d & \text{if } k = d-1. \end{cases} \tag{1.10}$$


The above observations suggest the following “normal form algorithm” for
the computation of the global residue $\operatorname{res}_P(H)$ for any $H \in \mathbb{C}[z]$:

1) Compute the remainder $r(z) = r_{d-1}z^{d-1} + \cdots + r_1 z + r_0$ in the Euclidean
division of $H$ by $P = a_d z^d + \cdots + a_0$.
2) Then, $\operatorname{res}_P(H) = r_{d-1}/a_d$.
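
The two steps of the normal form algorithm can be sketched in a few lines of Python. This is a minimal illustration under our own conventions (polynomials as coefficient lists with the lowest degree first, exact arithmetic through `fractions.Fraction`, hypothetical function names `poly_rem` and `global_residue`); it assumes $\deg P \ge 1$.

```python
from fractions import Fraction

def poly_rem(h, p):
    """Remainder of h modulo p, returned as a list of exactly deg(p) coefficients."""
    r = [Fraction(c) for c in h]
    d = len(p) - 1
    while len(r) - 1 >= d:
        q = r[-1] / p[-1]                  # quotient of the leading coefficients
        shift = len(r) - 1 - d
        for i in range(d + 1):
            r[shift + i] -= q * p[i]       # subtract q * z^shift * p
        r.pop()                            # the leading term has been cancelled
    return r + [Fraction(0)] * (d - len(r))

def global_residue(h, p):
    """Step 2 of the algorithm: res_P(H) = r_{d-1} / a_d."""
    d = len(p) - 1
    return poly_rem(h, p)[d - 1] / p[-1]

# Example: P = z^2 - 3z + 2 and H = z^3; the remainder is 7z - 6.
p = [2, -3, 1]
h = [0, 0, 0, 1]
print(poly_rem(h, p), global_residue(h, p))
```

In this example the value agrees with the sum of local residues $1^3/P'(1) + 2^3/P'(2) = -1 + 8$.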


We may also use (1.10) to reverse the local-global direction in the definition
of the residue obtaining, in the process, an algebraic definition which
extends to polynomials with coefficients in an arbitrary algebraically-closed
field $\mathbb{K}$ of characteristic zero. We illustrate this construction in the case of a
polynomial $P(z) = \sum_{j=0}^{d} a_j z^j \in \mathbb{K}[z]$ with simple zeros. Define a linear map
$L : \mathbb{K}[z]/(P) \to \mathbb{K}$ as in (1.10). Let $Z_P = \{\xi_1, \ldots, \xi_d\} \subset \mathbb{K}$ be the zeros of $P$
and $L_1, \ldots, L_d$ be the interpolating polynomials. For any $H \in \mathbb{K}[z]$ we set:

$$\operatorname{res}_{\xi_i}(H/P) := L([H \cdot L_i]).$$


One can then check that the defining property (1.3) is satisfied. We will discuss 
another algebraic definition of the univariate residue in Section 1.2.1 and 
we will discuss the general passage from the global to the local residue in 
Section 1.5.3. We conclude this section by remarking on another consequence 
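of Theorem 1.1.8. Suppose $P_1, P_2 \in \mathbb{C}[z]$ are such that their sets of zeros $Z_1$,
$Z_2$ are disjoint. Then, for any $H \in \mathbb{C}[z]$ such that

$$\deg H \le \deg P_1 + \deg P_2 - 2$$

we have that

$$\sum_{\xi \in Z_1 \cup Z_2} \operatorname{res}_\xi\!\left(\frac{H}{P_1 P_2}\right) = 0$$

and, therefore

$$\operatorname{res}_{P_1}(H/P_2) = \sum_{\xi \in Z_1} \operatorname{res}_\xi\!\left(\frac{H}{P_1 P_2}\right) = -\sum_{\xi \in Z_2} \operatorname{res}_\xi\!\left(\frac{H}{P_1 P_2}\right) = -\operatorname{res}_{P_2}(H/P_1). \tag{1.11}$$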


We denote the common value by $\operatorname{res}_{\{P_1,P_2\}}(H)$. Note that it is skew-symmetric
on $P_1, P_2$. This is the simplest manifestation of a toric residue ([Cox96,
CCD97]). We will discuss a multivariate generalization in Section 1.5.6.


1.2 Some applications of residues 


1.2.1 Duality and Bezoutian 


The global residue may be used to define a dualizing form in the algebra A. 
We give, first of all, a proof of this result based on the local properties of 
the residue and, after defining the notion of the Bezoutian, we will give an 
algebraic construction of the dual basis.

Theorem 1.2.1. For $P \in \mathbb{C}[z]$, let $A = \mathbb{C}[z]/(P)$. The pairing $A \times A \to \mathbb{C}$,

$$([H_1], [H_2]) \mapsto \operatorname{res}_P(H_1 \cdot H_2)$$

is non-degenerate, i.e.

$$\operatorname{res}_P(H_1 \cdot H_2) = 0 \ \text{ for all } H_2 \quad \text{if and only if} \quad H_1 \in (P).$$

Proof. Let $d = \deg P$ and denote by $\xi_1, \ldots, \xi_r$ the roots of $P$, with respective
multiplicities $m_1, \ldots, m_r$. Assume, for simplicity, that $P$ is monic. Suppose
$\operatorname{res}_P(H_1 \cdot H_2) = 0$ for all $H_2$. Given $i = 1, \ldots, r$, let $G_i = \prod_{j \ne i} (z - \xi_j)^{m_j}$.
Then, for any $0 \le \ell < m_i$,

$$0 = \operatorname{res}_P\big(H_1 \cdot (z - \xi_i)^{\ell}\, G_i\big) = \operatorname{res}_{\xi_i}\big(H_1/(z - \xi_i)^{m_i - \ell}\big)$$

which, in view of (1.1.1), implies that $(z - \xi_i)^{m_i}$ divides $H_1$. Since these factors
of $P$ are pairwise coprime, it follows that $H_1 \in (P)$, as desired.


As before, we denote by K an algebraically-closed field of characteristic 
zero. 


Definition 1.2.2. Let $P \in \mathbb{K}[z]$ be a polynomial of degree $d$. The Bezoutian
associated to $P$ is the bivariate polynomial

$$\Delta_P(z,w) := \frac{P(z) - P(w)}{z - w} = \sum_{i=0}^{d-1} \Delta_i(z)\, w^i \in \mathbb{K}[z,w].$$

Proposition 1.2.3. The classes $[\Delta_0(z)], \ldots, [\Delta_{d-1}(z)] \in A = \mathbb{K}[z]/(P)$ give
the dual basis of the standard basis $[1], [z], \ldots, [z^{d-1}]$, relative to the
non-degenerate pairing defined by the global residue.


Proof. We note, first of all, that

$$P(z) - P(w) = \left( \sum_{i=0}^{d-1} \Delta_i(z)\, w^i \right) (z - w) = \sum_{i=0}^{d} \big( z\Delta_i(z) - \Delta_{i-1}(z) \big)\, w^i,$$

where it is understood that $\Delta_{-1}(z) = \Delta_d(z) = 0$. Writing $P(w) = \sum_{i=0}^{d} a_i w^i$
and comparing coefficients we get the following recursive definition of $\Delta_i(z)$:

$$z\Delta_i(z) = \Delta_{i-1}(z) - a_i, \tag{1.12}$$

with initial step: $z\Delta_0(z) = P(z) - a_0$. We now compute $\operatorname{res}_P([z^j] \cdot [\Delta_i(z)])$.
Since $\deg \Delta_i = d-1-i$, $\deg(z^j \Delta_i(z)) = d-1-i+j$. Hence, if $i > j$,
$\deg(z^j \Delta_i(z)) \le d-2$ and, by Theorem 1.1.8,

$$\operatorname{res}_P([z^j] \cdot [\Delta_i(z)]) = 0 \quad \text{for } i > j.$$

If $i = j$, then $\deg(z^j \Delta_j) = d-1$ and it is easy to check from (1.12) that its
leading coefficient is $a_d$, the leading coefficient of $P$. Hence

$$\operatorname{res}_P([z^j] \cdot [\Delta_j(z)]) = \operatorname{res}_P(a_d z^{d-1}) = 1.$$

Finally, we consider the case $i < j$. The relations (1.12) give:

$$z^j \Delta_i(z) = z^{j-1}\, z\Delta_i(z) = z^{j-1} \big( \Delta_{i-1}(z) - a_i \big)$$

and, therefore

$$\operatorname{res}_P(z^j \Delta_i(z)) = \operatorname{res}_P(z^{j-1} \Delta_{i-1}(z))$$

given that $\operatorname{res}_P(a_i z^{j-1}) = 0$ since $j-1 \le d-2$. Continuing in this manner
we obtain

$$\operatorname{res}_P(z^j \Delta_i(z)) = \cdots = \operatorname{res}_P(z^{j-i} \Delta_0(z)) = \operatorname{res}_P(z^{j-i-1} P(z)) = 0.$$


Remark 1.2.4. Note that Proposition 1.2.3 provides an algebraic proof of
Theorem 1.2.1. Indeed, we have shown that Theorem 1.2.1 only depends on the
conditions (1.10) that we used in the algebraic characterization of the global
residue. We may also use Proposition 1.2.3 to give an alternative algebraic
definition of the global residue. Let $\Phi : A \times A \to \mathbb{K}$ denote the bilinear
symmetric form defined by the requirement that $\Phi([z^i], [\Delta_j]) = \delta_{ij}$. Then, the
global residue map $\operatorname{res} : A \to \mathbb{K}$ is defined as the unique linear map such that

$$\Phi(\alpha, \beta) = \operatorname{res}(\alpha \cdot \beta), \quad \text{for } \alpha, \beta \in A.$$


Remark 1.2.5. The recursive relations (1.12) are exactly those defining the
classical Horner polynomials $H_{d-i}(z) = a_d z^{i-1} + a_{d-1} z^{i-2} + \cdots + a_{d-i+1}$,
associated to the polynomial $P(z) = \sum_{j=0}^{d} a_j z^j$.
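
The recursion (1.12) and the duality of Proposition 1.2.3 can be verified on a small example. The Python sketch below is an illustration under our own conventions (coefficient lists with the lowest degree first, hypothetical helper names). It builds the coefficients of the Bezoutian from $\Delta_{d-1} = a_d$ and $\Delta_{i-1} = z\Delta_i + a_i$, and evaluates the pairing with the normal form algorithm for the global residue.

```python
from fractions import Fraction

def poly_mul(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def poly_rem(h, p):
    """Remainder of h modulo p, padded to exactly deg(p) coefficients."""
    r = [Fraction(c) for c in h]
    d = len(p) - 1
    while len(r) - 1 >= d:
        q = r[-1] / p[-1]
        shift = len(r) - 1 - d
        for i in range(d + 1):
            r[shift + i] -= q * p[i]
        r.pop()
    return r + [Fraction(0)] * (d - len(r))

def global_residue(h, p):
    return poly_rem(h, p)[len(p) - 2] / p[-1]      # r_{d-1} / a_d

def bezoutian_coefficients(p):
    """[Delta_0, ..., Delta_{d-1}] from Delta_{d-1} = a_d, Delta_{i-1} = z*Delta_i + a_i."""
    d = len(p) - 1
    deltas = [None] * d
    current = [Fraction(p[d])]
    deltas[d - 1] = current
    for i in range(d - 1, 0, -1):
        current = [Fraction(p[i])] + current       # multiply by z, then add a_i
        deltas[i - 1] = current
    return deltas

p = [5, -1, 0, 2]                                  # illustrative P = 2z^3 - z + 5
deltas = bezoutian_coefficients(p)
d = len(p) - 1
# gram[i][j] = res_P(z^j * Delta_i); Proposition 1.2.3 predicts the identity matrix.
gram = [[global_residue(poly_mul([0] * j + [1], deltas[i]), p) for j in range(d)]
        for i in range(d)]
print(deltas)
print(gram)
```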


1.2.2 Interpolation 


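Definition 1.2.6. Let $Z := \{\xi_1, \ldots, \xi_r\} \subset \mathbb{K}$ be a finite set of points together
with multiplicities $m_1, \ldots, m_r \in \mathbb{N}$. Let $d = m_1 + \cdots + m_r$ and $h \in \mathbb{K}[z]$. A
polynomial $H \in \mathbb{K}[z]$ is said to interpolate $h$ over $Z$ if $\deg H \le d-1$ and
$H^{(j)}(\xi_i) = h^{(j)}(\xi_i)$ for all $i = 1, \ldots, r$ and $j = 0, \ldots, m_i - 1$.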

Proposition 1.2.7. Let $Z \subset \mathbb{K}$ and $h \in \mathbb{K}[z]$ be as above. Let
$P(z) := \prod_{i=1}^{r} (z - \xi_i)^{m_i}$. Then $H$ interpolates $h$ over $Z$ if and only if $[H] = [h]$ in
$A = \mathbb{K}[z]/(P)$, i.e. if $H$ is the remainder of dividing $h$ by $P$.

Proof. If we write $h = Q \cdot P + H$, with $\deg H < d$, then

$$h^{(j)}(\xi_i) = \sum_{k=0}^{j} c_k\, Q^{(k)}(\xi_i)\, P^{(j-k)}(\xi_i) + H^{(j)}(\xi_i),$$

for suitable coefficients $c_k \in \mathbb{K}$. Since $P^{(\ell)}(\xi_i) = 0$ for $\ell = 0, \ldots, m_i - 1$, it
follows that $H$ interpolates $h$. On the other hand, it is easy to check that the
interpolating polynomial is unique and the result follows.

Lemma 1.2.8. With notation as above, given $h \in \mathbb{K}[z]$, the interpolating
polynomial $H$ of $h$ over $Z$ equals

$$H(w) = \sum_{i=0}^{d-1} c_i(h)\, w^i \quad \text{where } c_i(h) = \operatorname{res}_P(h \cdot \Delta_i).$$


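Proof. This is a straightforward consequence of the fact that $\operatorname{res}_P(z^j \cdot \Delta_i(z)) = \delta_{ij}$.
For the sake of completeness, we sketch a proof for the complex case using
the integral representation of the residue.

For any $\varepsilon > 0$ and any $w$ with $|P(w)| < \varepsilon$, we have by the Cauchy integral
formula

$$h(w) = \frac{1}{2\pi i} \int_{|P(z)|=\varepsilon} \frac{h(z)}{z - w}\, dz = \frac{1}{2\pi i} \int_{|P(z)|=\varepsilon} \frac{h(z)}{P(z) - P(w)}\, \Delta_P(z,w)\, dz.$$

Denote $\Gamma := \{|P(z)| = \varepsilon\}$; for any $z \in \Gamma$ we have the expansion

$$\frac{1}{P(z) - P(w)} = \frac{1}{P(z)} \cdot \frac{1}{1 - \frac{P(w)}{P(z)}} = \sum_{n \ge 0} \frac{P(w)^n}{P(z)^{n+1}},$$

which is uniformly convergent over $\Gamma$. Then,

$$h(w) = \sum_{n \ge 0} \left( \frac{1}{2\pi i} \int_{\Gamma} \frac{h(z)\, \Delta_P(z,w)}{P(z)^{n+1}}\, dz \right) P(w)^n, \tag{1.13}$$

and so, isolating the first summand we get

$$h(w) = \operatorname{res}_P\big(h(z)\, \Delta_P(z,w)\big) + Q(w)\, P(w). \tag{1.14}$$

Finally, call $H(w) := \operatorname{res}_P(h(z)\, \Delta_P(z,w))$. It is easy to check that $H = 0$
or $\deg(H) \le d-1$, and by linearity of the residue operator,
$H(w) = \sum_{i=0}^{d-1} c_i(h)\, w^i$, as desired.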


1.2.3 Ideal membership 
Let again $P(z) = \sum_{i=0}^{d} a_i z^i \in \mathbb{C}[z]$. While in the univariate case this is trivial, it is useful to observe that Theorem 1.2.1 allows us to derive a residual system of $d$ linear equations in the coefficients of all polynomials $H(z) = \sum_{j=0}^{m} h_j z^j$ of degree less than or equal to $m$, whose vanishing is equivalent to the condition that $H \in \langle P\rangle$.

Such a system can be deduced from any basis $B = \{\beta_0,\dots,\beta_{d-1}\}$ of $A = \mathbb{C}[z]/\langle P\rangle$. We can choose for instance the canonical basis of monomials $\{[z^j],\ j = 0,\dots,d-1\}$, or the dual basis $\{[\Delta_k(z)],\ k = 0,\dots,d-1\}$. Theorem 1.2.1 means that $H \in \langle P\rangle$, i.e. $[H] = 0$, if and only if

$$ \operatorname{res}_P([H]\cdot\beta_i) = \sum_{j=0}^{m} h_j\,\operatorname{res}_P([z^j]\,\beta_i) = 0 \quad \forall\, i = 0,\dots,d-1. $$

Suppose $m > d$. When $B$ is the monomial basis, the first $d\times d$ minor of the $d\times m$ matrix of the system is triangular, while if $B$ is the dual basis given by the Bezoutian, this minor is simply the identity.

If $H \in \langle P\rangle$, we can obtain the quotient $Q(z) = H(z)/P(z) \in \mathbb{C}[z]$ from equations (1.13), (1.14). Indeed, we have:

$$ Q(w) = \sum_{n\ge 1} \operatorname{res}_{P^{n+1}}(H(z)\,\Delta_P(z,w))\,P(w)^{n-1}. $$

By Theorem 1.1.8, the terms in this sum vanish when $n \ge \frac{\deg(H)+1}{d}$.


1.2.4 Partial fraction decomposition 


We recall the partial fraction decomposition of univariate rational functions. 
This is a very important classical result because of its usefulness in the com- 
putation of integrals of rational functions. 

Let $P, H \in K[z]$ with $\deg(H)+1 \le \deg(P) = d$. Let $\{\xi_1,\dots,\xi_r\}$ be the zeros of $P$ and let $m_1,\dots,m_r$ denote their multiplicities. Then the rational function $H(z)/P(z)$ may be written as:

$$ \frac{H(z)}{P(z)} = \sum_{i=1}^{r}\left(\frac{A_{i1}}{z-\xi_i} + \cdots + \frac{A_{i m_i}}{(z-\xi_i)^{m_i}}\right) \tag{1.15} $$

for appropriate constants $A_{ij} \in K$.

There are, of course, many elementary proofs of this result. Here we would like to show how it follows from the Euler-Jacobi vanishing Theorem 1.1.8. The argument below also gives a simple formula for the coefficients in (1.15) when $P$ has only simple zeros.

For any $z \notin \{\xi_1,\dots,\xi_r\}$ we consider the auxiliary polynomial $P_1(w) = (z-w)P(w) \in K[w]$. Its zeros are $\xi_i$, with multiplicity $m_i$, $i = 1,\dots,r$, and $z$ with multiplicity one. On the other hand, $\deg H \le \deg P_1 - 2$, and therefore Theorem 1.1.8 gives:

$$ 0 = \operatorname{res}_{P_1}(H) = \operatorname{res}_z(H/P_1) + \sum_{i=1}^{r}\operatorname{res}_{\xi_i}(H/P_1). $$

Since $P_1$ has a simple zero at $z$, we have $\operatorname{res}_z(H/P_1) = H(z)/P_1'(z) = -H(z)/P(z)$ and, therefore

$$ \frac{H(z)}{P(z)} = \sum_{i=1}^{r}\operatorname{res}_{\xi_i}\left(\frac{H(w)}{(z-w)P(w)}\right). $$

In case $P$ has simple zeros we have $\operatorname{res}_{\xi_i}(H/P_1) = H(\xi_i)/P_1'(\xi_i)$, and $P_1'(\xi_i) = (z-\xi_i)P'(\xi_i)$, which gives:

$$ \frac{H(z)}{P(z)} = \sum_{i=1}^{r}\frac{H(\xi_i)/P'(\xi_i)}{z-\xi_i}. $$


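In the general case, it follows from (1.2) that

$$ \operatorname{res}_{\xi_i}(H/P_1) = k_i\,\frac{d^{m_i-1}}{dw^{m_i-1}}\left[\frac{H(w)}{(z-w)\prod_{j\ne i}(w-\xi_j)^{m_j}}\right]_{w=\xi_i} = \sum_{j=1}^{m_i}\frac{a_j}{(z-\xi_i)^j} $$

for suitable constants $k_i$ and $a_j$.
We leave it as an exercise for the reader to compute explicit formulas for the coefficients $A_{ij}$ in (1.15).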
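For simple zeros, the coefficients $H(\xi_i)/P'(\xi_i)$ are easy to test numerically. The Python sketch below (exact rationals; the function names and the sample polynomials are illustrative assumptions, not part of the text) computes them for $P = (z-1)(z-2)(z+3)$ and $H = z+5$ and compares both sides of the decomposition at a sample point.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Evaluate an ascending coefficient list at x by Horner's rule."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def poly_deriv(p):
    return [i * c for i, c in enumerate(p)][1:]

def simple_partial_fractions(H, P, roots):
    """Coefficients A_i = H(xi_i) / P'(xi_i); valid only when all zeros of P are simple."""
    dP = poly_deriv(P)
    return [poly_eval(H, Fraction(x)) / poly_eval(dP, Fraction(x)) for x in roots]

# P(z) = (z - 1)(z - 2)(z + 3) = z^3 - 7z + 6,  H(z) = z + 5  (deg H + 1 <= deg P).
P = [6, -7, 0, 1]
H = [5, 1]
roots = [1, 2, -3]
A = simple_partial_fractions(H, P, roots)

# Compare both sides of the decomposition at a point away from the zeros of P.
z = Fraction(7, 3)
lhs = poly_eval(H, z) / poly_eval(P, z)
rhs = sum(a / (z - x) for a, x in zip(A, roots))
```

The zeros are supplied by hand here; for a general $P$ they would first have to be computed or approximated.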


1.2.5 Computation of traces and Newton sums 


Let $P(z) = \sum_{i=0}^{d} a_i z^i \in \mathbb{C}[z]$ be a polynomial of degree $d$, $\{\xi_1,\dots,\xi_r\}$ the set of zeros of $P$, and $m_1,\dots,m_r$ their multiplicities. As always, we denote by $A$ the $\mathbb{C}$-algebra $A = \mathbb{C}[z]/\langle P\rangle$. We recall (cf. Theorem 2.1.4 in Chapter 2) that for any polynomial $Q \in \mathbb{C}[z]$, the eigenvalues of the multiplication map

$$ M_Q : A \to A, \qquad [H] \mapsto [Q\cdot H] $$

are the values $Q(\xi_i)$. In particular, using (1.4), the trace of $M_Q$ may be expressed in terms of global residues:

$$ \operatorname{tr}(M_Q) = \sum_{i} m_i\, Q(\xi_i) = \operatorname{res}_P(Q\cdot P'). $$

Theorem 1.2.9. The pairing $A \times A \to \mathbb{C}$

$$ ([g_1],[g_2]) \mapsto \operatorname{tr}(M_{g_1 g_2}) = \operatorname{res}_P(g_1\cdot g_2\cdot P') $$

is non degenerate only when all zeros of $P$ are simple. More generally, the trace $\operatorname{tr}(M_{g_1 g_2}) = 0$ for all $g_2$ if and only if $g_1(\xi_i) = 0$ for all $i = 1,\dots,r$ or, equivalently, if and only if $g_1 \in \sqrt{\langle P\rangle}$.

Proof. Fix $g_1 \in \mathbb{C}[z]$. As $\operatorname{tr}(M_{g_1 g_2}) = \operatorname{res}_P(g_1\cdot P'\cdot g_2)$, it follows from Theorem 1.2.1 that the trace of $g_1\cdot g_2$ vanishes for all $g_2$ if and only if $g_1 P' \in \langle P\rangle$. But this happens if and only if $g_1$ vanishes over $Z_P$, since the multiplicity of $P'$ at any zero $p$ of $P$ is one less than the multiplicity of $P$ at $p$.

As $\operatorname{tr}_P(Q)$ is linear in $Q$, all traces can be computed from those corresponding to the monomials $z^k$; i.e. the power sums of the roots:

$$ S_k := \sum_{i=1}^{r} m_i\,\xi_i^k = \operatorname{res}_P(z^k\cdot P'(z)). $$
It is well known that the $S_k$'s are rational functions of the elementary symmetric functions on the zeros of $P$, i.e. the coefficients of $P$, and conversely (up to the choice of $a_d$). Indeed, the classical Newton identities give recursive relations to obtain one from the other. It is interesting to remark that not only can the power sums $S_k$ be expressed in terms of residues, but that we can also use residues to obtain the Newton identities. The proof below is an adaptation to the one-variable case of the approach followed by Aizenberg and Kytmanov [AK81] to study the multivariate analogues.


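Lemma 1.2.10. (Newton identities) For all $\ell = 0,\dots,d-1$,

$$ (d-\ell)\,a_\ell = -\sum_{j>\ell}^{d} a_j\, S_{j-\ell}. \tag{1.16} $$

Proof. The formula (1.16) follows from computing:

$$ \operatorname{res}\left(\frac{P'(z)}{z^{\ell}\,P(z)}\,P(z)\right), \quad \ell \in \mathbb{N}, $$

in two different ways:

i) As $\operatorname{res}\left(\frac{P'(z)}{z^{\ell}}\right) = \frac{P^{(\ell)}(0)}{(\ell-1)!} = \ell\, a_\ell$.

ii) Expanding it as a sum:

$$ \sum_{j<\ell} a_j \operatorname{res}\left(\frac{P'(z)}{z^{\ell-j}\,P(z)}\right) + \sum_{j\ge\ell} a_j \operatorname{res}\left(\frac{z^{j-\ell}\,P'(z)}{P(z)}\right). $$

The terms in the first sum vanish by Theorem 1.1.8 since $\deg(z^{\ell-j}P(z)) \ge \deg(P'(z)) + 2$, while the second sum may be expressed as $\sum_{j\ge\ell} a_j S_{j-\ell}$. Since $S_0 = d$, the identity (1.16) follows.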
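The identities (1.16) can be verified directly on a polynomial with known zeros. The following Python sketch is an illustrative check with exact rationals; the helper names and the chosen zeros are our own assumptions. It builds $P$ from its zeros (including a double zero), computes the power sums $S_k$, and evaluates the difference of the two sides of (1.16).

```python
from fractions import Fraction

def poly_from_roots(roots, lead=1):
    """Ascending coefficients of lead * prod (z - xi), zeros listed with multiplicity."""
    p = [Fraction(lead)]
    for xi in roots:
        q = [Fraction(0)] * (len(p) + 1)
        for i, c in enumerate(p):
            q[i + 1] += c            # contribution of c * z
            q[i] -= c * xi           # contribution of -xi * c
        p = q
    return p

def power_sums(roots, kmax):
    """S_k = sum of xi^k over the zeros (with multiplicity), for k = 0..kmax."""
    return [sum(Fraction(xi) ** k for xi in roots) for k in range(kmax + 1)]

roots = [2, 2, -1, Fraction(1, 3)]   # a double zero at 2
a = poly_from_roots(roots, lead=3)
d = len(a) - 1
S = power_sums(roots, d)

# Newton identities: (d - l) a_l + sum_{j > l} a_j S_{j-l} = 0 for l = 0..d-1.
residuals = [
    (d - l) * a[l] + sum(a[j] * S[j - l] for j in range(l + 1, d + 1))
    for l in range(d)
]
```

All entries of `residuals` vanish, independently of the leading coefficient chosen.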


1.2.6 Counting integer points in lattice tetrahedra 


Let $P \subset \mathbb{R}^n$ be a polytope with integral vertices and let $P^\circ$ denote its interior. For any $t \in \mathbb{N}$, call

$$ L(P,t) := \#\,(t\cdot P)\cap\mathbb{Z}^n; \qquad L(P^\circ,t) := \#\,(t\cdot P^\circ)\cap\mathbb{Z}^n, $$

the number of lattice points in the dilated polyhedron $t\cdot P$ and in its dilated interior. Ehrhart [Ehr67] proved that these are polynomial functions of degree $n$. They are known as the Ehrhart polynomials associated to $P$ and $P^\circ$. Moreover, he determined the two leading coefficients and the constant term in terms of the volume of the polytope, the normalized volume of its boundary and its Euler characteristic. The other coefficients are not as easily accessible, and a method of computing these coefficients was unknown until quite recently (cf. [Bar94, KK93, Pom93]). There is a remarkable relation between these two polynomials, the Ehrhart-Macdonald reciprocity law:

$$ L(P^\circ,t) = (-1)^n\, L(P,-t). $$


In [Bec00], Matthias Beck shows how to express these polynomials in terms 
of (multidimensional) residues. In the particular case when P is a tetrahedron, 
this is just a rational one-dimensional residue. We illustrate Beck’s approach 
by sketching a proof of Ehrhart-Macdonald reciprocity in the case of a tetra- 
hedron. 


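Fix $a_1,\dots,a_n \in \mathbb{N}$ and consider the tetrahedron with vertices at the origin and at the points $(0,\dots,a_i,\dots,0)$:

$$ \Sigma = \Big\{(x_1,\dots,x_n) \in \mathbb{R}^n_{\ge 0} : \sum_{k=1}^{n}\frac{x_k}{a_k} \le 1\Big\}. $$

Clearly,

$$ t\cdot\Sigma = \Big\{(x_1,\dots,x_n) \in \mathbb{R}^n_{\ge 0} : \sum_{k=1}^{n}\frac{x_k}{a_k} \le t\Big\}. $$

Let $A := \prod_{i=1}^{n} a_i$, $A_k := \prod_{i\ne k} a_i$, $k = 1,\dots,n$. Then,

$$ L(\Sigma,t) = \#\Big\{m \in \mathbb{Z}^n_{\ge 0} : \sum_{k=1}^{n}\frac{m_k}{a_k} \le t\Big\} = \#\Big\{m \in \mathbb{Z}^n_{\ge 0} : \sum_{k=1}^{n} m_k A_k \le tA\Big\} = \#\Big\{m \in \mathbb{Z}^{n+1}_{\ge 0} : \sum_{k=1}^{n} m_k A_k + m_{n+1} = tA\Big\}. $$

So, we can interpret $L(\Sigma,t)$ as the coefficient of $z^{tA}$ in the series product:

$$ (1 + z^{A_1} + z^{2A_1} + \cdots)\cdots(1 + z^{A_n} + z^{2A_n} + \cdots)(1 + z + z^2 + \cdots), $$

i.e. as the coefficient of $z^{tA}$ in the Taylor expansion at the origin of

$$ \frac{1}{(1-z^{A_1})\cdots(1-z^{A_n})(1-z)}. $$

Thus,

$$ L(\Sigma,t) = \operatorname{res}_0\left(\frac{z^{-tA-1}}{(1-z^{A_1})\cdots(1-z^{A_n})(1-z)}\right) = 1 + \operatorname{res}_0\left(\frac{z^{-tA}-1}{(1-z^{A_1})\cdots(1-z^{A_n})(1-z)\,z}\right). $$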
For $t \in \mathbb{Z}$, let us denote by $f_t(z)$ the rational function

$$ f_t(z) = \frac{z^{tA}-1}{z\cdot(1-z)\cdot\prod_{i=1}^{n}(1-z^{A_i})}. $$

Note that for $t > 0$, $\operatorname{res}_0(f_t) = -1$, while for $t < 0$, $\operatorname{res}_\infty(f_t) = 0$. In particular, denoting by $Z$ the set of non-zero, finite poles of $f_t$, we have for $t > 0$:

$$ L(\Sigma,t) = 1 + \operatorname{res}_0(f_{-t}(z)) = 1 - \sum_{\xi\in Z}\operatorname{res}_\xi(f_{-t}(z)). \tag{1.17} $$

Since $L(\Sigma,t)$ is a polynomial, this identity now holds for every $t$.
Similarly, we compute that

$$ L(\Sigma^\circ,t) = \#\Big\{m \in \mathbb{Z}^{n+1}_{>0} : \sum_{k=1}^{n} m_k A_k + m_{n+1} = tA\Big\}. $$

That means that $L(\Sigma^\circ,t)$ is the coefficient of $w^{tA}$ in the series product:

$$ (w^{A_1} + w^{2A_1} + \cdots)\cdots(w^{A_n} + w^{2A_n} + \cdots)(w + w^2 + \cdots) $$

or, in terms of residues:

$$ L(\Sigma^\circ,t) = \operatorname{res}_0\left(\frac{w^{A_1}\cdots w^{A_n}\,(w^{-tA}-1)}{(1-w^{A_1})\cdots(1-w^{A_n})(1-w)}\right). $$

The change of variables $z = 1/w$ now yields

$$ L(\Sigma^\circ,t) = -\operatorname{res}_\infty\left(\frac{z^{tA}-1}{(z^{A_1}-1)\cdots(z^{A_n}-1)(z-1)\,z}\right) = (-1)^n\operatorname{res}_\infty(f_t(z)). \tag{1.18} $$

The Ehrhart-Macdonald reciprocity law now follows from comparing (1.17) and (1.18), and using the fact that for $t > 0$, $\operatorname{res}_0(f_t) = -1$.
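Ehrhart-Macdonald reciprocity can also be observed experimentally. The Python sketch below is a brute-force illustration, not Beck's residue method: it counts lattice points in $t\cdot\Sigma$ for $t = 0,\dots,n$, interpolates the degree-$n$ Ehrhart polynomial, evaluates it at $-t$, and compares with a direct count of interior points. The function names and the choice $a = (2,3)$ are our own assumptions.

```python
from fractions import Fraction
from itertools import product

def count_points(a, t, interior=False):
    """Count lattice points m >= 0 with sum m_k / a_k <= t; for the interior,
    require every m_k > 0 and the strict inequality sum m_k / a_k < t."""
    lo = 1 if interior else 0
    ranges = [range(lo, t * ak + 1) for ak in a]
    total = 0
    for m in product(*ranges):
        s = sum(Fraction(mk, ak) for mk, ak in zip(m, a))
        if (s < t) if interior else (s <= t):
            total += 1
    return total

def lagrange_eval(xs, ys, x):
    """Value at x of the unique polynomial of degree < len(xs) through the points (xs, ys)."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

a = (2, 3)                       # triangle with vertices (0,0), (2,0), (0,3)
n = len(a)
ts = list(range(n + 1))          # n + 1 samples determine a polynomial of degree n
L_closed = [count_points(a, t) for t in ts]

# Pairs (L(interior, t), (-1)^n L(closed, -t)) for a few positive t.
checks = [
    (count_points(a, t, interior=True), (-1) ** n * lagrange_eval(ts, L_closed, -t))
    for t in range(1, 4)
]
```

For this triangle the interpolated polynomial is $3t^2 + 3t + 1$, and each pair in `checks` has equal entries.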




1.3 Resultants in one variable 


1.3.1 Definition 


Fix two natural numbers $d_1, d_2$ and consider generic univariate polynomials of these degrees and coefficients in a field $k$:

$$ P(z) = \sum_{i=0}^{d_1} a_i z^i, \qquad Q(z) = \sum_{i=0}^{d_2} b_i z^i. \tag{1.19} $$

The system $P(z) = Q(z) = 0$ is, in general, overdetermined and has no solutions. The following result is classical:
Theorem 1.3.1. There exists a unique (up to sign) irreducible polynomial

$$ \operatorname{Res}_{d_1,d_2}(P,Q) = \operatorname{Res}_{d_1,d_2}(a_0,\dots,a_{d_1},b_0,\dots,b_{d_2}) \in \mathbb{Z}[a_0,\dots,a_{d_1},b_0,\dots,b_{d_2}], $$

called the resultant of $P$ and $Q$, which verifies that for any specialization of the coefficients $a_i, b_i$ in $k$ with $a_{d_1} \ne 0$, $b_{d_2} \ne 0$, the resultant vanishes if and only if the polynomials $P$ and $Q$ have a common root in any algebraically closed field $K$ containing $k$.

Geometrically, the hypersurface $\{(a,b) \in K^{d_1+d_2+2} : \operatorname{Res}_{d_1,d_2}(a,b) = 0\}$ is the projection of the incidence variety $\{(a,b,z) \in K^{d_1+d_2+3} : \sum_{i=0}^{d_1} a_i z^i = \sum_{i=0}^{d_2} b_i z^i = 0\}$; that is to say, the variable $z$ is eliminated. Here, and in what follows, $K$ denotes an algebraically closed field.

A well known theorem of Sylvester allows us to compute the resultant as the determinant of a matrix of size $d_1 + d_2$, whose entries are 0 or a coefficient of either $P$ or $Q$. For instance, when $d_1 = d_2 = 2$, the resultant is the following polynomial in 6 variables $(a_0, a_1, a_2, b_0, b_1, b_2)$:

$$ b_2^2 a_0^2 - 2 b_2 a_0 a_2 b_0 + a_2^2 b_0^2 - b_1 b_2 a_1 a_0 - b_1 a_1 a_2 b_0 + a_2 b_1^2 a_0 + b_0 b_2 a_1^2 $$

and can be computed as the determinant of the $4\times 4$ matrix:

$$ M_{2,2} = \begin{pmatrix} a_0 & 0 & b_0 & 0 \\ a_1 & a_0 & b_1 & b_0 \\ a_2 & a_1 & b_2 & b_1 \\ 0 & a_2 & 0 & b_2 \end{pmatrix} \tag{1.20} $$

Let us explain how one gets this result. The basic idea is to linearize the problem in order to use the eliminant polynomial par excellence: the determinant. Note that the determinant of a square homogeneous linear system $A\cdot x = 0$ allows us to eliminate $x$: the existence of a non trivial solution $x \ne 0$ of the system is equivalent to the fact that the determinant of $A$ (a polynomial in the entries of $A$) vanishes.

Assume $\deg(P) = d_1$, $\deg(Q) = d_2$. A first observation is that $P$ and $Q$ have a common root if and only if they have a common factor of positive degree (since $P(z_0) = 0$ if and only if $z - z_0$ divides $P$). Moreover, the existence of such a common factor is equivalent to the existence of polynomials $g_1, g_2$, not both zero, with $\deg(g_1) \le d_2 - 1$, $\deg(g_2) \le d_1 - 1$, such that $g_1 P + g_2 Q = 0$. Denote by $S_\ell$ the space of polynomials of degree at most $\ell$ and consider the map

$$ S_{d_2-1}\times S_{d_1-1} \longrightarrow S_{d_1+d_2-1}, \qquad (g_1,g_2) \longmapsto g_1 P + g_2 Q. \tag{1.21} $$

This defines a $K$-linear map between two finite dimensional $K$-vector spaces of the same dimension $d_1 + d_2$, which is surjective (and therefore an isomorphism) if and only if $P$ and $Q$ do not have any common root in $K$. Denote by $M_{d_1,d_2}$ the matrix of this linear map in the monomial bases. It is called the Sylvester matrix associated to $P$ and $Q$. Then

$$ \operatorname{Res}_{d_1,d_2}(P,Q) = \pm\det(M_{d_1,d_2}). \tag{1.22} $$

The sign in this last equality cannot be determined, but the positive sign is taken by convention.
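The construction of the Sylvester matrix is easy to mechanize. The Python sketch below is an illustration under our own naming (`sylvester`, `det`, `resultant`); it follows the column layout of (1.20), with the ascending coefficients of $P, zP, \dots, z^{d_2-1}P, Q, zQ, \dots, z^{d_1-1}Q$, and takes the determinant with the positive sign by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction

def sylvester(p, q):
    """Sylvester matrix in the layout of (1.20): columns hold the ascending
    coefficients of P, zP, ..., z^(d2-1) P, Q, zQ, ..., z^(d1-1) Q."""
    d1, d2 = len(p) - 1, len(q) - 1
    n = d1 + d2
    m = [[Fraction(0)] * n for _ in range(n)]
    for j in range(d2):                       # columns coming from z^j * P
        for i, c in enumerate(p):
            m[i + j][j] = Fraction(c)
    for j in range(d1):                       # columns coming from z^j * Q
        for i, c in enumerate(q):
            m[i + j][d2 + j] = Fraction(c)
    return m

def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * result

def resultant(p, q):
    return det(sylvester(p, q))

# P = (z - 1)(z - 2) and Q = (z - 2)(z + 5) share the root z = 2.
common = resultant([2, -3, 1], [-10, 3, 1])
# P = z^2 - 2 and Q = z^2 - 3 have no common root.
coprime = resultant([-2, 0, 1], [-3, 0, 1])
```

The first resultant vanishes and the second does not, as Theorem 1.3.1 predicts. Cofactor or fraction-free elimination would be needed for symbolic coefficients; rational elimination is chosen here only for brevity.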

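Note that for $d_1 = d_2 = 2$ we obtain the matrix $M_{2,2}$ in (1.20). The general shape of the Sylvester matrix is:

$$ M_{d_1,d_2} = \begin{pmatrix}
a_0 & & & & b_0 & & & \\
a_1 & a_0 & & & b_1 & b_0 & & \\
\vdots & a_1 & \ddots & & \vdots & b_1 & \ddots & \\
\vdots & \vdots & \ddots & a_0 & \vdots & \vdots & \ddots & b_0 \\
a_{d_1} & \vdots & & a_1 & b_{d_2} & \vdots & & b_1 \\
 & a_{d_1} & & \vdots & & b_{d_2} & & \vdots \\
 & & \ddots & \vdots & & & \ddots & \vdots \\
 & & & a_{d_1} & & & & b_{d_2}
\end{pmatrix} $$

where the blank spaces are filled with zeros.

Note that setting $a_{d_1} = 0$ but $b_{d_2} \ne 0$, the determinant of the Sylvester matrix equals $b_{d_2}$ times the determinant of the Sylvester matrix $M_{d_1-1,d_2}$ (in $a_0,\dots,a_{d_1-1},b_0,\dots,b_{d_2}$). We deduce that when $\deg(P) = d_1' < d_1$ and $\deg(Q) = d_2$, the restriction of the $(d_1,d_2)$ resultant polynomial to the closed set $(a_{d_1} = \cdots = a_{d_1'+1} = 0)$ of polynomials of degrees $d_1', d_2$ factorizes as

$$ \operatorname{Res}_{d_1,d_2}(P,Q) = b_{d_2}^{\,d_1-d_1'}\,\operatorname{Res}_{d_1',d_2}(P,Q). $$

What happens if we specialize both $P$ and $Q$ to polynomials of respective degrees smaller than $d_1$ and $d_2$? Then, the last row of the Sylvester matrix is zero and so the resultant vanishes, but in principle $P$ and $Q$ do not need to have a common root in $K$. One way to recover the equivalence between the vanishing of the resultant and the existence of a common root is the following.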

Given $P, Q$ as in (1.19), consider the homogenizations $P^h, Q^h$ defined by

$$ P^h(z,w) = \sum_{i=0}^{d_1} a_i z^i w^{d_1-i}, \qquad Q^h(z,w) = \sum_{i=0}^{d_2} b_i z^i w^{d_2-i}. $$

Then, $P, Q$ can be recovered by evaluating at $w = 1$, and $(z_0,1)$ is a common root of $P^h, Q^h$ if and only if $P(z_0) = Q(z_0) = 0$. But also, on one hand $P^h(0,0) = Q^h(0,0) = 0$ for any choice of coefficients, and on the other $P^h, Q^h$ have the common root $(1,0)$ when $a_{d_1} = b_{d_2} = 0$. The space obtained as the classes of pairs $(z,w) \ne (0,0)$ after identification of $(z,w)$ with $(\lambda z,\lambda w)$ for any $\lambda \in K - \{0\}$, denoted $\mathbb{P}^1(K)$, is called the projective line over $K$. Since for homogeneous polynomials as $P^h$ it holds that $P^h(\lambda z,\lambda w) = \lambda^{d_1} P^h(z,w)$ (and similarly for $Q^h$), it makes sense to speak of their zeros in $\mathbb{P}^1(K)$. So, we could restate Theorem 1.3.1 saying that for any specialization of the coefficients of $P$ and $Q$, the resultant vanishes if and only if their homogenizations have a common root in $\mathbb{P}^1(K)$. As we have already remarked, when $K = \mathbb{C}$, the projective space $\mathbb{P}^1(\mathbb{C})$ can be identified with the Riemann sphere, a compactification of the complex plane, where the class of the point $(1,0)$ is identified with the point at infinity.


1.3.2 Main properties 


It is interesting to realize that many properties of the resultant can be derived 
from its expression (1.22) as the determinant of the Sylvester matrix: 


i) The resultant $\operatorname{Res}_{d_1,d_2}$ is homogeneous in the coefficients of $P$ and $Q$ separately, with respective degrees $d_2, d_1$. So, the degree of the resultant in the coefficients of $P$ is the number of roots of $Q$, and vice-versa.

ii) The resultants $\operatorname{Res}_{d_1,d_2}$ and $\operatorname{Res}_{d_2,d_1}$ coincide up to sign.

iii) There exist polynomials $A_1, A_2 \in \mathbb{Z}[a_0,\dots,b_{d_2}][z]$ with $\deg(A_1) = d_2 - 1$, $\deg(A_2) = d_1 - 1$ such that

$$ \operatorname{Res}_{d_1,d_2}(P,Q) = A_1 P + A_2 Q. \tag{1.23} $$

Let us sketch the proof of property iii). If we add to the first row in the Sylvester matrix $z$ times the second row, plus $z^2$ times the third row, and so on, the first row becomes

$$ P(z) \quad zP(z) \quad \dots \quad z^{d_2-1}P(z) \quad Q(z) \quad zQ(z) \quad \dots \quad z^{d_1-1}Q(z) $$

but the determinant is unchanged. Expanding along this modified first row, we obtain the desired result.

Another important classical property of the resultant $\operatorname{Res}_{d_1,d_2}(P,Q)$ is that it can be written as a product over the zeros of $P$ or $Q$:

Proposition 1.3.2. (Poisson formula) Let $P, Q$ be polynomials with respective degrees $d_1, d_2$ and write $P(z) = a_{d_1}\prod_{i=1}^{r}(z-p_i)^{m_i}$, $Q(z) = b_{d_2}\prod_{j=1}^{s}(z-q_j)^{n_j}$. Then

$$ \operatorname{Res}_{d_1,d_2}(P,Q) = a_{d_1}^{d_2}\prod_{i=1}^{r} Q(p_i)^{m_i} = (-1)^{d_1 d_2}\, b_{d_2}^{d_1}\prod_{j=1}^{s} P(q_j)^{n_j}. $$

Proof. Again, one possible way of proving the Poisson formula is by showing that

$$ \operatorname{Res}_{d_1,d_2}((z-p)P_1, Q) = Q(p)\,\operatorname{Res}_{d_1-1,d_2}(P_1, Q), $$

using the expression of the resultant as the determinant of the Sylvester matrix, and standard properties of determinants. The proof would be completed by induction, and the homogeneity of the resultant.

Alternatively, one could observe that $R'(a,b) := a_{d_1}^{d_2}\prod_{i=1}^{r} Q(p_i)^{m_i}$ depends polynomially on the coefficients of $Q$ and, given the equalities

$$ R'(a,b) = a_{d_1}^{d_2}\, b_{d_2}^{d_1}\prod_{i,j}(p_i-q_j)^{m_i n_j} = (-1)^{d_1 d_2}\, b_{d_2}^{d_1}\prod_{j=1}^{s} P(q_j)^{n_j}, $$

on the coefficients of $P$ as well. Since the roots are unchanged by dilation of the coefficients, we see that, as $\operatorname{Res}_{d_1,d_2}$, the polynomial $R'$ has degree $d_1 + d_2$ in the coefficients $(a,b) = (a_0,\dots,b_{d_2})$. Moreover, $R'(a,b) = 0$ if and only if there exists a common root, i.e. if and only if $\operatorname{Res}_{d_1,d_2}(a,b) = 0$. This holds in principle over the open set $(a_{d_1} \ne 0,\ b_{d_2} \ne 0)$ but this implies that the loci $\{R' = 0\}$ and $\{\operatorname{Res}_{d_1,d_2} = 0\}$ in $K^{d_1+d_2+2}$ agree. Then, the irreducibility of $\operatorname{Res}_{d_1,d_2}$ implies the existence of a constant $c \in K$ such that $\operatorname{Res}_{d_1,d_2} = c\cdot R'$. Evaluating at $P(z) = 1$, $Q(z) = z^{d_2}$, the Sylvester matrix $M_{d_1,d_2}$ reduces to the identity $I_{d_1+d_2}$ and we get $c = 1$.


We immediately deduce 


Corollary 1.3.3. Assume $P = P_1\cdot P_2$ with $\deg(P_1) = d_1'$, $\deg(P_2) = d_1''$, and $\deg(Q) = d_2$. Then,

$$ \operatorname{Res}_{d_1'+d_1'',d_2}(P,Q) = \operatorname{Res}_{d_1',d_2}(P_1,Q)\,\operatorname{Res}_{d_1'',d_2}(P_2,Q). $$


There are other determinantal formulas to compute the resultant, coming from suitable generalizations of the map (1.21), which are for instance described in [DD01]. In case $d_1 = d_2 = 3$, the Sylvester matrix $M_{3,3}$ is $6\times 6$. Denote $[ij] := a_i b_j - a_j b_i$, for all $i,j = 0,\dots,3$. The resultant $\operatorname{Res}_{3,3}$ can also be computed as the determinant of the following $3\times 3$ matrix:

$$ B_{3,3} := \begin{pmatrix} [03] & [02] & [01] \\ [13] & [03]+[12] & [02] \\ [23] & [13] & [03] \end{pmatrix}, \tag{1.24} $$

or as minus the determinant of the $5\times 5$ matrix

$$ \begin{pmatrix} a_0 & 0 & b_0 & 0 & [01] \\ a_1 & a_0 & b_1 & b_0 & [02] \\ a_2 & a_1 & b_2 & b_1 & [03] \\ a_3 & a_2 & b_3 & b_2 & 0 \\ 0 & a_3 & 0 & b_3 & 0 \end{pmatrix}. $$

Let us explain how the matrix $B_{3,3}$ was constructed and why $\operatorname{Res}_{3,3} = \det(B_{3,3})$. We assume, more generally, that $d_1 = d_2 = d$.


Definition 1.3.4. Let $P, Q$ be polynomials of degree $d$ as in (1.19). The Bezoutian polynomial associated to $P$ and $Q$ is the bivariate polynomial

$$ \Delta_{P,Q}(z,y) = \frac{P(z)Q(y) - P(y)Q(z)}{z-y} = \sum_{i,j=0}^{d-1} c_{ij}\, z^i y^j. $$

The $d\times d$ matrix $B_{P,Q} = (c_{ij})$ is called the Bezoutian matrix associated to $P$ and $Q$.

Note that $\Delta_{P,1} = \Delta_P$ defined in (1.2.2) and that each coefficient $c_{ij}$ is a linear combination with integer coefficients of the brackets $[k,\ell] = a_k b_\ell - a_\ell b_k$.

Proposition 1.3.5. With the above notations,

$$ \operatorname{Res}_{d,d}(a,b) = \det(B_{P,Q}). \tag{1.25} $$


Proof. The argument is very similar to the one presented in the proof of Poisson's formula. Call $R' := \det(B_{P,Q})$. This is a homogeneous polynomial in the coefficients $(a,b)$ of the same degree $2d = d + d$ as the resultant. Moreover, if $\operatorname{Res}_{d,d}(a,b) = 0$, there exists $z_0 \in K$ such that $P(z_0) = Q(z_0) = 0$, and so $\Delta_{P,Q}(y,z_0) = \sum_{i=0}^{d-1}\left(\sum_{j=0}^{d-1} c_{ij}\, z_0^j\right) y^i$ is the zero polynomial. This shows that $R'(a,b) = 0$ since the non trivial vector $(1, z_0,\dots,z_0^{d-1})$ lies in the kernel of the Bezoutian matrix $B_{P,Q}$. By Hilbert's Nullstellensatz, the resultant divides a power of $R'$. Using the irreducibility of $\operatorname{Res}_{d,d}$ plus a particular specialization to adjust the constant, we get the desired result.


The Bezoutian matrices are more compact and practical experience seems 
to indicate that these matrices are numerically more stable than the Sylvester 
matrices. 
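To see the smaller Bezoutian matrix at work, the following Python sketch builds the coefficients $c_{ij}$ from the brackets $[k,\ell]$ and compares its determinant with that of the Sylvester matrix. This is an illustration under our own conventions: the function names are hypothetical, the indexing of $c_{ij}$ may differ from the ordering used in (1.24), and the two determinants are therefore only compared up to sign.

```python
from fractions import Fraction

def bezout_matrix(p, q):
    """Coefficients c[i][j] of (P(z)Q(y) - P(y)Q(z)) / (z - y) = sum c_ij z^i y^j,
    for P, Q of the same degree d (ascending coefficient lists)."""
    d = len(p) - 1
    c = [[Fraction(0)] * d for _ in range(d)]
    for k in range(d + 1):
        for l in range(k):
            bracket = Fraction(p[k]) * q[l] - Fraction(p[l]) * q[k]   # [k, l]
            # (z^k y^l - z^l y^k) / (z - y) = sum_{m=0}^{k-l-1} z^(l+m) y^(k-1-m)
            for m in range(k - l):
                c[l + m][k - 1 - m] += bracket
    return c

def sylvester(p, q):
    """Columns: ascending coefficients of P, zP, ..., Q, zQ, ..."""
    d1, d2 = len(p) - 1, len(q) - 1
    n = d1 + d2
    mat = [[Fraction(0)] * n for _ in range(n)]
    for j in range(d2):
        for i, coef in enumerate(p):
            mat[i + j][j] = Fraction(coef)
    for j in range(d1):
        for i, coef in enumerate(q):
            mat[i + j][d2 + j] = Fraction(coef)
    return mat

def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * result

P = [1, -2, 0, 3]            # 1 - 2z + 3z^3
Q = [4, 1, -1, 2]            # 4 + z - z^2 + 2z^3
bez = det(bezout_matrix(P, Q))      # 3 x 3 determinant
syl = det(sylvester(P, Q))          # 6 x 6 determinant

# Two cubics sharing the root z = 1: both determinants must vanish.
P2 = [-1, 1, -1, 1]          # (z - 1)(z^2 + 1)
Q2 = [6, -5, -2, 1]          # (z - 1)(z + 2)(z - 3)
bez_common = det(bezout_matrix(P2, Q2))
```

The $3\times 3$ determinant reproduces the $6\times 6$ one in absolute value, which is the compactness referred to above.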


1.4 Some applications of resultants 


1.4.1 Systems of equations in two variables 


Suppose that we want to solve a polynomial system in two variables $f(z,y) = g(z,y) = 0$ with $f, g \in K[z,y]$. We can "hide the variable $y$ in the coefficients" and think of $f, g \in K[y][z]$. Denote by $d_1, d_2$ the respective degrees in the variable $z$. Then, the resultant $\operatorname{Res}_{d_1,d_2}(f,g)$ with respect to the variable $z$ will give us back a polynomial (with integer coefficients) in the coefficients, i.e. we will have a polynomial in $y$, which vanishes on every $y_0$ for which there exists $z_0$ with $f(z_0,y_0) = g(z_0,y_0) = 0$. So, we can eliminate the variable $z$ from the system, detect the second coordinates $y_0$ of the solutions, and then try to recover the full solutions $(z_0,y_0)$.

Assume for instance that $f(z,y) = z^2 + y^2 - 10$, $g(z,y) = z^2 + 2y^2 + zy - 16$. We write

$$ f(z,y) = z^2 + 0z + (y^2 - 10), \qquad g(z,y) = z^2 + yz + (2y^2 - 16). $$

Then, $\operatorname{Res}_{2,2}(f,g)$ equals

$$ \operatorname{Res}_{2,2}((1,0,y^2-10),(1,y,2y^2-16)) = -22y^2 + 2y^4 + 36 = 2(y+3)(y-3)(y^2-2). $$

For each of the four roots $y_0 = -3, 3, \sqrt{2}, -\sqrt{2}$, we replace $g(z,y_0) = 0$ by the difference $g(z,y_0) - f(z,y_0) = y_0 z + y_0^2 - 6 = 0$, and we need to solve $z = \frac{6-y_0^2}{y_0}$. Note that $f(z,y_0) = 0$ will also be satisfied due to the vanishing of the resultant. So, there is precisely one solution $z$ for each $y_0$. The system has $4 = 2\times 2$ real solutions.
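The elimination in this example can be reproduced with a few lines of code. The Python sketch below is an illustration with our own helper names; it stores each coefficient of $f$ and $g$ (as polynomials in $z$) as a polynomial in $y$, fills the $4\times 4$ Sylvester matrix in the layout of (1.20), and expands the determinant along the first row using polynomial arithmetic.

```python
def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pdet(m):
    """Determinant of a square matrix of polynomials, expanding along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = [0]
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        term = pmul(entry, pdet(minor))
        if j % 2:
            term = [-c for c in term]
        total = padd(total, term)
    return total

def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

# Coefficients in z (ascending); each one is a polynomial in y (ascending):
# f = (y^2 - 10) + 0*z + 1*z^2,   g = (2y^2 - 16) + y*z + 1*z^2.
f = [[-10, 0, 1], [0], [1]]
g = [[-16, 0, 2], [0, 1], [1]]
Z = [0]
sylv = [
    [f[0], Z,    g[0], Z],
    [f[1], f[0], g[1], g[0]],
    [f[2], f[1], g[2], g[1]],
    [Z,    f[2], Z,    g[2]],
]
res = trim(pdet(sylv))          # ascending coefficients of the resultant in y

# For the rational roots y0 = 3, -3, recover z0 = (6 - y0^2) / y0 from g - f = 0.
rational_solutions = [((6 - y0 * y0) // y0, y0) for y0 in (3, -3)]
```

Here `res` is `[36, 0, -22, 0, 2]`, i.e. $2y^4 - 22y^2 + 36$, and the two rational solutions found are $(z_0,y_0) = (-1,3)$ and $(1,-3)$.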

It is easy to deduce from the results and observations made in Section 1.3 
the following extension theorem. 


Theorem 1.4.1. Write $f(z,y) = \sum_{i=0}^{d_1} f_i(y)\, z^i$, $g(z,y) = \sum_{i=0}^{d_2} g_i(y)\, z^i$, with $f_i, g_i \in K[y]$, and $f_{d_1}, g_{d_2}$ non zero. Let $y_0$ be a root of the resultant with respect to $z$, $\operatorname{Res}_{d_1,d_2}(f,g) \in K[y]$. If either $f_{d_1}(y_0) \ne 0$ or $g_{d_2}(y_0) \ne 0$, there exists $z_0 \in K$ such that $f(z_0,y_0) = g(z_0,y_0) = 0$.


Assume now that $f(z,y) = yz - 1$, $g(z,y) = y^3 - y$. It is immediate to check that they have two common roots, namely $\{f = g = 0\} = \{(1,1), (-1,-1)\}$. Replace $g$ by the polynomial $\tilde{g} := g + f$. Then, $\{f = \tilde{g} = 0\} = \{f = g = 0\}$ but now both $f, \tilde{g}$ have positive degree 1 with respect to the variable $z$. The resultant with respect to $z$ equals

$$ \operatorname{Res}_{1,1}(f,\tilde{g}) = \det\begin{pmatrix} y & -1 \\ y & y^3 - y - 1 \end{pmatrix} = y^2(y^2 - 1). $$

Since both leading coefficients with respect to $z$ are equal to the polynomial $y$, Theorem 1.4.1 asserts that the two roots $y_0 = \pm 1$ can be extended. On the contrary, the root $y_0 = 0$ cannot be extended.

Consider now $f(z,y) = yz^2 + z - 1$, $g(z,y) = y^3 - y$ and let us again consider $f$ and $\tilde{g} := g + f$, which have positive degree 2 with respect to $z$. In this case, $y_0 = 0$ is a root of $\operatorname{Res}_{2,2}(f,\tilde{g}) = y^4(y^2 - 1)^2$. Again, $y_0 = 0$ annihilates both leading coefficients with respect to $z$. But nevertheless it can be extended to the solution $(z_0,y_0) = (1,0)$.

So, two comments should be made. The first one is that finding roots of univariate polynomials is in general not an algorithmic task! One can try to detect the rational solutions or to approximate the roots numerically if working with polynomials with complex coefficients. The second one is that even if we can obtain the second coordinates explicitly, we have in general a sufficient but not necessary condition to ensure that a given partial solution $y_0$ can be extended to a solution $(z_0,y_0)$ of the system, and an ad hoc study may be needed.


1.4.2 Implicit equations of curves 


Consider a parametric plane curve $C$ given by $z = f(t)$, $y = g(t)$, where $f, g \in K[t]$, or more precisely,

$$ C = \{(z,y) \in K^2 : z = f(t),\ y = g(t) \text{ for some } t \in K\}. $$

Having this parametric expression allows one to "follow" or "travel along" the curve, but it is hard to detect if a given point in the plane is in $C$. One can instead find an implicit equation $f \in K[z,y]$, i.e. a bivariate polynomial $f$ such that $C = \{f = 0\}$. This amounts to eliminating $t$ from the equations $z - f(t) = y - g(t) = 0$ and can thus be done by computing the resultant with respect to $t$ of these polynomials.

This task could also be solved by a Gröbner basis computation. But we invite the reader to try in any computer algebra system the following example suggested to us by Ralf Fröberg. Consider the curve $C$ defined by $z = t^{32}$, $y = t^{48} - t^{56} - t^{60} - t^{62} - t^{63}$. Then the resultant $\operatorname{Res}_{32,63}(t^{32} - z,\ t^{48} - t^{56} - t^{60} - t^{62} - t^{63} - y)$ with respect to $t$ can be computed in a few seconds, giving the answer $f(z,y)$ we are looking for. It is a polynomial of degree 63 in $z$ and degree 32 in $y$ with 257 terms. On the other hand, a Gröbner basis computation seems to be infeasible.

For a plane curve $C$ with a rational parametrization, i.e.

$$ C = \{(p_1(t)/q_1(t),\ p_2(t)/q_2(t)) : q_1(t) \ne 0,\ q_2(t) \ne 0\}, $$

where $p_i, q_i \in K[t]$, the elimination ideal

$$ I_1 := \langle q_1(t)\,z - p_1(t),\ q_2(t)\,y - p_2(t)\rangle \cap K[z,y] $$

defines the Zariski closure of $C$ in $K^2$. We can obtain a generator of $I_1$ with a resultant computation that eliminates $t$. For example, let

$$ C = \left\{\left(\frac{t^2-1}{(1+2t)^2},\ \frac{t+1}{(1+2t)(1-t)}\right) : t \ne 1, -\tfrac{1}{2}\right\}. $$

Then $\overline{C} = V(I_1)$ is the zero locus of

$$ f(z,y) = \operatorname{Res}_{2,2}\big((1+2t)^2 z - (t^2-1),\ (1+2t)(1-t)\,y - (t+1)\big), $$

which equals

$$ 27y^2 z - 18yz + 4y + 4z^2 - z. $$

We leave it to the reader to verify that $C$ is not Zariski closed.
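A quick sanity check of this implicit equation is to substitute points of the parametrization into it. The Python sketch below does so with exact rationals; the function names `point` and `implicit` and the sample parameter values are our own choices.

```python
from fractions import Fraction

def point(t):
    """Point (z, y) of the parametrized curve for a parameter value t != 1, -1/2."""
    t = Fraction(t)
    z = (t * t - 1) / (1 + 2 * t) ** 2
    y = (t + 1) / ((1 + 2 * t) * (1 - t))
    return z, y

def implicit(z, y):
    """The candidate implicit equation 27 y^2 z - 18 y z + 4 y + 4 z^2 - z."""
    return 27 * y**2 * z - 18 * y * z + 4 * y + 4 * z**2 - z

values = [implicit(*point(t)) for t in (0, 2, -3, Fraction(1, 3), Fraction(-5, 7))]
```

Every entry of `values` is zero. Such a test only confirms that the curve lies in the zero locus; it does not by itself show that the polynomial generates the elimination ideal.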

One could also try to implicitize non planar curves. We show a general classical trick in the case of the space curve $C$ with parametrization $x = t^2$, $y = t^3$, $z = t^5$. We have 3 polynomials $x - t^2$, $y - t^3$, $z - t^5$ from which we want to eliminate $t$. Add two new indeterminates $u, v$ and compute the resultant

$$ \operatorname{Res}_{2,5}\big(x - t^2,\ u(y - t^3) + v(z - t^5)\big) = (-y^2 + x^3)\,u^2 + (2x^4 - 2yz)\,uv + (-z^2 + x^5)\,v^2. $$

Then, since the resultant must vanish for all specializations of $u$ and $v$, we deduce that

$$ C = \{-y^2 + x^3 = x^4 - yz = -z^2 + x^5 = 0\}. $$
1.4.3 Bézout's theorem in two variables


Similarly to the construction of $\mathbb{P}^1(K)$, one can define the projective plane $\mathbb{P}^2(K)$ (and in general projective $n$-space) as the complete variety whose points are identified with lines through the origin in $K^3$. We may embed $K^2$ in $\mathbb{P}^2(K)$ as the set of lines through the points $(x,y,1)$. Again, it makes sense to speak of the zero set in $\mathbb{P}^2(K)$ of homogeneous polynomials (i.e. polynomials $f(x,y,z)$ such that $f(\lambda x,\lambda y,\lambda z) = \lambda^d f(x,y,z)$, for $d = \deg(f)$).

Given two homogeneous polynomials $f, g \in K[x,y,z]$ without common factors, with $\deg(f) = d_1$, $\deg(g) = d_2$, a classical theorem of Bézout asserts that they have $d_1\cdot d_2$ common points of intersection in $\mathbb{P}^2(K)$, counted with appropriate intersection multiplicities. A proof of this theorem using resultants is given for instance in [CLO97]. The following weaker version suffices to obtain such nice consequences as Pascal's Mystic Hexagon theorem [CLO97, Sect. 8.7] (see Corollary 1.5.15 for a proof using multivariable residues).


Theorem 1.4.2. Let $f, g \in K[x,y,z]$ be homogeneous polynomials, without common factors, and of respective degrees $d_1, d_2$. Then $(f = 0)\cap(g = 0)$ is finite and has at most $d_1\cdot d_2$ points.

Proof. Assume $(f = 0)\cap(g = 0)$ has more than $d_1\cdot d_2$ points, which we label $p_0,\dots,p_{d_1 d_2}$. Let $L_{ij}$ be the line through $p_i$ and $p_j$ for $i,j = 0,\dots,d_1 d_2$. Making a linear change of coordinates, we can assume that $(0,0,1) \notin (f = 0)\cup(g = 0)\cup(\cup_{ij} L_{ij})$. Write $f = \sum_{i=0}^{d_1} a_i z^i$, $g = \sum_{j=0}^{d_2} b_j z^j$, as polynomials in $z$ with coefficients $a_i, b_j \in K[x,y]$. Since $f(0,0,1) \ne 0$, $g(0,0,1) \ne 0$ and $f$ and $g$ do not have any common factor, it is straightforward to verify from the expression of the resultant as the determinant of the Sylvester matrix, that the resultant $\operatorname{Res}_{d_1,d_2}(f,g)$ with respect to $z$ is a non zero homogeneous polynomial in $x, y$ of total degree $d_1\cdot d_2$. Write $p_i = (x_i,y_i,z_i)$. Then, $\operatorname{Res}_{d_1,d_2}(f,g)(x_i,y_i) = 0$ for all $i = 0,\dots,d_1\cdot d_2$. The fact that $(0,0,1)$ does not lie in any of the lines $L_{ij}$ implies that the $(d_1 d_2 + 1)$ points $(x_i,y_i)$ are distinct, and we get a contradiction.


1.4.4 GCD computations and Bézout identities 


Let $P, Q$ be two univariate polynomials with coefficients in a field $k$. Assume they are coprime, i.e. that their greatest common divisor $\operatorname{GCD}(P,Q) = 1$. We can then find polynomials $h_1, h_2 \in k[z]$ such that the Bézout identity $1 = h_1 P + h_2 Q$ is satisfied, by means of the Euclidean algorithm to compute $\operatorname{GCD}(P,Q)$. As we have already remarked, $\operatorname{GCD}(P,Q) = 1$ if and only if $P$ and $Q$ do not have any common root in any algebraically closed field $K$ containing $k$. If $d_1, d_2$ denote the respective degrees, this happens precisely when $\operatorname{Res}_{d_1,d_2}(P,Q) \ne 0$. Note that since the resultant is an integer polynomial in the coefficients, $\operatorname{Res}_{d_1,d_2}(P,Q)$ also lies in $k$. Moreover, by property iii) in Section 1.3.2, one deduces that

$$ 1 = \frac{A_1}{\operatorname{Res}_{d_1,d_2}(P,Q)}\,P + \frac{A_2}{\operatorname{Res}_{d_1,d_2}(P,Q)}\,Q. \tag{1.26} $$

So, it is possible to find $h_1, h_2$ whose coefficients are rational functions with integer coefficients evaluated in the coefficients of the input polynomials $P, Q$, and denominators equal to the resultant. Moreover, these polynomials can be explicitly obtained from the proof of (1.23). In particular, the coefficients of $A_1, A_2$ are particular minors of the Sylvester matrix $M_{d_1,d_2}$.

This has also been extended to compute $\operatorname{GCD}(P,Q)$ even when $P$ and $Q$ are not coprime (and the resultant vanishes), based on the so called subresultants, which are again obtained from particular minors of $M_{d_1,d_2}$. Note that $\operatorname{GCD}(P,Q)$ is the (monic) polynomial of least degree in the ideal generated by $P$ and $Q$ (i.e. among the polynomial linear combinations $h_1 P + h_2 Q$). So one is led to study non surjective specializations of the linear map (1.21). In fact, the dimension of its kernel equals the degree of $\operatorname{GCD}(P,Q)$, i.e. the number of common roots of $P$ and $Q$, counted with multiplicity.

Note that if $1 \le d_2 \le d_1$ and $C = \sum_{i=0}^{d_1-d_2} c_i z^i$ is the quotient of $P$ in the Euclidean division by $Q$, the remainder equals

$$ R = P - \sum_{i=0}^{d_1-d_2} c_i\,(z^i Q). $$

Thus, subtracting from the first column of $M_{d_1,d_2}$ the linear combination of the columns corresponding to $z^i Q$, $i = 0,\dots,d_1-d_2$, with respective coefficients $c_i$, we do not change the determinant but we get the coefficients of $R$ in the first column. In fact, it holds that

$$ \operatorname{Res}_{d_1,d_2}(P,Q) = b_{d_2}^{\,d_1-d_2+1}\,\operatorname{Res}_{d_2-1,d_2}(R,Q). $$

So, one could describe an algorithm for computing resultants similar to the Euclidean algorithm. However, the Euclidean remainder sequence to compute greatest common divisors has a relatively bad numerical behavior. Moreover, it has bad specialization properties when the coefficients depend on parameters. Collins [Col67] studied the connections between subresultants and Euclidean remainders, and he proved in particular that the polynomials in the two sequences are pairwise proportional. But the subresultant sequence has a good behavior under specializations and well controlled growth of the size of the coefficients. Several efficient algorithms have been developed to compute subresultants [LRD00].


1.4.5 Algebraic numbers 


A complex number $\alpha$ is said to be algebraic if there exists a nonzero polynomial $P \in \mathbb{Q}[z]$ such that $P(\alpha) = 0$. The algebraic numbers form a subfield of $\mathbb{C}$. This can be easily proved using resultants.

Lemma 1.4.3. Let $P, Q \in \mathbb{Q}[z]$ with degrees $d_1, d_2$ and let $\alpha, \beta \in \mathbb{C}$ be such that $P(\alpha) = Q(\beta) = 0$. Then,

i) $\alpha + \beta$ is a root of the polynomial $u_+(z) = \operatorname{Res}_{d_1,d_2}(P(z-y), Q(y))$,
ii) $\alpha\cdot\beta$ is a root of the polynomial $u_\times(z) = \operatorname{Res}_{d_1,d_2}(y^{d_1}P(z/y), Q(y))$,
iii) for $\alpha \ne 0$, $\alpha^{-1}$ is a root of the polynomial $u_{-1}(z) = \operatorname{Res}_{1,d_1}(zy - 1, P(y))$,

where the resultants are taken with respect to $y$.

The proof of Lemma 1.4.3 is immediate. Note that even if $P$ (resp. $Q$) is the minimal polynomial annihilating $\alpha$ (resp. $\beta$), i.e. the monic polynomial with minimal degree having $\alpha$ (resp. $\beta$) as a root, the roots of the polynomial $u_\times$ are all the products $\alpha_i\cdot\beta_j$ where $\alpha_i$ (resp. $\beta_j$) is any root of $P$ (resp. $Q$), which need not be all different, and so $u_\times$ need not be the minimal polynomial annihilating $\alpha\cdot\beta$. This happens for instance in case $\alpha = \sqrt{2}$, $P(z) = z^2 - 2$, $\beta = \sqrt{3}$, $Q(z) = z^2 - 3$, where $u_\times(z) = (z^2 - 6)^2$.
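The polynomials of Lemma 1.4.3 can be computed for the example $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ by taking a Sylvester determinant whose entries are polynomials in $z$. The Python sketch below is illustrative only; the helper names are ours, and the determinant is taken with the positive sign in the column layout of (1.20).

```python
def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pdet(m):
    """Determinant of a square matrix of polynomials, expanding along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = [0]
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        term = pmul(entry, pdet(minor))
        total = padd(total, [-c for c in term] if j % 2 else term)
    return total

def sylvester_poly(p, q):
    """Sylvester matrix (columns P, yP, ..., Q, yQ, ...) with polynomial entries in z."""
    d1, d2 = len(p) - 1, len(q) - 1
    n = d1 + d2
    m = [[[0] for _ in range(n)] for _ in range(n)]
    for j in range(d2):
        for i, c in enumerate(p):
            m[i + j][j] = c
    for j in range(d1):
        for i, c in enumerate(q):
            m[i + j][d2 + j] = c
    return m

def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

# Coefficients in y (ascending); each coefficient is a polynomial in z (ascending).
Q = [[-3], [0], [1]]                      # Q(y) = y^2 - 3
P_shift = [[-2, 0, 1], [0, -2], [1]]      # P(z - y) = y^2 - 2zy + (z^2 - 2)
P_scale = [[0, 0, 1], [0], [-2]]          # y^2 P(z/y) = z^2 - 2y^2

u_plus = trim(pdet(sylvester_poly(P_shift, Q)))     # annihilates sqrt(2) + sqrt(3)
u_times = trim(pdet(sylvester_poly(P_scale, Q)))    # annihilates sqrt(2) * sqrt(3)
```

This gives $u_+(z) = z^4 - 10z^2 + 1$ and $u_\times(z) = z^4 - 12z^2 + 36 = (z^2-6)^2$, the latter being the non-minimal polynomial mentioned above.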


1.4.6 Discriminants 


Given a generic univariate polynomial of degree $d$, $P(z) = a_0 + a_1 z + \cdots + a_d z^d$, $a_d \ne 0$, the existence of an irreducible polynomial $D_d(P) = D_d(a_0,\dots,a_d) \in \mathbb{Z}[a_0,\dots,a_d]$ is also classical. It is called the discriminant (or $d$-discriminant), and its value at a particular set of coefficients (with $a_d \ne 0$) is non-zero if and only if the corresponding polynomial of degree $d$ has only simple roots. Equivalently, $D_d(a_0,\dots,a_d) = 0$ if and only if there exists $z \in \mathbb{C}$ with $P(z) = P'(z) = 0$.
Geometrically, the discriminantal hypersurface

$$ \{a = (a_0,\dots,a_d) \in \mathbb{C}^{d+1} : D_d(a) = 0\} $$

is the projection over the first $(d+1)$ coordinates of the intersection of the hypersurfaces $\{(a,z) \in \mathbb{C}^{d+2} : a_0 + a_1 z + \cdots + a_d z^d = 0\}$ and $\{(a,z) \in \mathbb{C}^{d+2} : a_1 + 2a_2 z + \cdots + d\,a_d z^{d-1} = 0\}$, i.e. the variable $z$ is eliminated.

The first guess would be that $D_d(P)$ equals the resultant $\operatorname{Res}_{d,d-1}(P,P')$, but it is easy to see that in fact $\operatorname{Res}_{d,d-1}(P,P') = (-1)^{d(d-1)/2}\, a_d\, D_d(P)$. In case $d = 2$, $P(z) = az^2 + bz + c$, $D_2(a,b,c)$ is the well known discriminant $b^2 - 4ac$. When $d = 6$ for instance, $D_6$ is an irreducible polynomial of degree 10 in the coefficients $(a_0,\dots,a_6)$ with 246 terms.
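The relation between the discriminant and the resultant of $P$ and $P'$ can be tested numerically. The Python sketch below is an illustration with our own function names; it takes the Sylvester determinant of $(P, P')$ with columns holding ascending coefficients, as in (1.20), and divides by $(-1)^{d(d-1)/2} a_d$.

```python
from fractions import Fraction

def sylvester(p, q):
    """Columns: ascending coefficients of P, zP, ..., Q, zQ, ..."""
    d1, d2 = len(p) - 1, len(q) - 1
    n = d1 + d2
    m = [[Fraction(0)] * n for _ in range(n)]
    for j in range(d2):
        for i, c in enumerate(p):
            m[i + j][j] = Fraction(c)
    for j in range(d1):
        for i, c in enumerate(q):
            m[i + j][d2 + j] = Fraction(c)
    return m

def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * result

def discriminant(p):
    """D_d(P) recovered from Res_{d,d-1}(P, P') = (-1)^(d(d-1)/2) a_d D_d(P)."""
    d = len(p) - 1
    dp = [i * c for i, c in enumerate(p)][1:]
    return (-1) ** (d * (d - 1) // 2) * det(sylvester(p, dp)) / p[-1]

quad = discriminant([2, 3, 1])         # z^2 + 3z + 2: b^2 - 4ac = 9 - 8
cubic = discriminant([6, -7, 0, 1])    # (z - 1)(z - 2)(z + 3)
double = discriminant([1, -2, 1])      # (z - 1)^2 has a double root
```

The cubic value, 400, equals the product of the squared differences of the roots $1, 2, -3$, and the last value vanishes because of the double root.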

The extremal monomials and coefficients of the discriminant have very 
interesting combinatorial descriptions. This notion has important applications 
in singularity theory and number theory. The distance of the coefficients of 
a given polynomial to the discriminantal hypersurface is also related to the 
numerical stability of the computation of its roots. For instance, consider the 
Wilkinson polynomial $P(z) = (z+1)(z+2)\cdots(z+19)(z+20)$, which clearly
has 20 real roots at distance at least 1 from the others, and is known to be
numerically unstable. The coefficients of $P$ are very close to the coefficients
of a polynomial with a multiple root. The polynomial
$Q(z) = P(z) + 10^{-9} z^{19}$, obtained by a "small perturbation" of one of the
coefficients of $P$, has only 12 real roots and 4 pairs of imaginary roots, one
of which has imaginary part close to $\pm 0.88\,i$. Consider then the parametric
family of polynomials $P_\lambda(z) = P(z) + \lambda z^{19}$ and note that
$P(z) = P_0$ and $Q(z) = P_{10^{-9}}$. Thus, for some intermediate value of
$\lambda$, two complex roots merge to give a double real root and therefore that
value of the parameter is a zero of the discriminant
$D(\lambda) = D_{20}(P_\lambda)$.
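The loss of real roots under the perturbation can be examined without any floating-point root finding. The sketch below is our own illustration, not taken from the text: it counts distinct real roots with a Sturm sequence in exact rational arithmetic. All function names are hypothetical, and polynomials are stored as coefficient lists with the lowest degree first.

```python
from fractions import Fraction

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]

def remainder(a, b):
    """Remainder of the division of a by b (b must be non-zero)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        while a and a[-1] == 0:          # the leading term has been cancelled
            a.pop()
    return a

def sturm_chain(p):
    chain = [p, derivative(p)]
    while True:
        r = remainder(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def sign_variations(values):
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_real_roots(p):
    """Number of distinct real roots of p, by Sturm's theorem."""
    chain = sturm_chain([Fraction(c) for c in p])
    at_plus_inf = [q[-1] for q in chain]
    at_minus_inf = [q[-1] * (-1) ** (len(q) - 1) for q in chain]
    return sign_variations(at_minus_inf) - sign_variations(at_plus_inf)

# Wilkinson polynomial P and the perturbed polynomial Q = P + 10^-9 z^19.
P = [Fraction(1)]
for k in range(1, 21):
    P = poly_mul(P, [Fraction(k), Fraction(1)])   # multiply by (z + k)
Q = list(P)
Q[19] += Fraction(1, 10**9)

print(count_real_roots(P))   # P has 20 distinct real roots
print(count_real_roots(Q))   # the text reports 12 for this perturbation
```

Replacing `Fraction(1, 10**9)` by other values of $\lambda$ gives a crude way to bracket a parameter value at which the count of real roots changes, that is, a zero of $D(\lambda)$.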


1.5 Multidimensional residues 


In this section we will extend the theory of residues to the several variables 
case. As in the one-dimensional case we will begin with an “integral” definition 
of local residue from which we will define the total residue as a sum of local 
ones. We will also indicate how one can give a purely algebraic definition of 
global, and then local, residues using Bezoutians. We shall also touch upon 
the geometric definition of Arnold, Varchenko and Gusein-Zadé [AGZV85]. 

Let $K$ be an algebraically closed field of characteristic zero and let
$I \subset K[x_1,\dots,x_n]$ be a zero-dimensional ideal. We denote by
$Z(I) = \{\xi_1,\dots,\xi_s\} \subset K^n$ the variety of zeros of $I$. We will
assume, moreover, that $I$ is a complete intersection ideal, i.e. that it has a
presentation of the form $I = \langle P_1,\dots,P_n\rangle$,
$P_i \in K[x_1,\dots,x_n]$. For simplicity, we will denote by
$\langle P\rangle$ the ordered $n$-tuple $\{P_1,\dots,P_n\}$. As before, let $A$
be the finite dimensional commutative algebra $A = K[x_1,\dots,x_n]/I$. Our goal
is to define a linear map

$$\operatorname{res}_{\langle P\rangle} : A \to K$$

whose properties are similar to the univariate residue map. In particular, we
would like it to be dualizing in the sense of Theorem 1.2.1 and to be compatible
with local maps $\operatorname{res}_{\langle P\rangle,\xi} : A_\xi \to K$,
$\xi \in Z(I)$.


1.5.1 Integral definition 


In case $K = \mathbb{C}$, given $\xi \in Z(I)$, let $U \subset \mathbb{C}^n$ be
an open neighborhood of $\xi$ containing no other points of $Z(I)$, and let
$H \in \mathbb{C}[x_1,\dots,x_n]$. We define the local Grothendieck residue

$$\operatorname{res}_{\langle P\rangle,\xi}(H) = \frac{1}{(2\pi i)^n}
\int_{\Gamma_\xi(\epsilon)} \frac{H(x)}{P_1(x)\cdots P_n(x)}\,
dx_1 \wedge \cdots \wedge dx_n, \tag{1.27}$$

where $\Gamma_\xi(\epsilon)$ is the real $n$-cycle
$\Gamma_\xi(\epsilon) = \{x \in U : |P_i(x)| = \epsilon_i\}$ oriented by the
$n$-form $d(\arg(P_1)) \wedge \cdots \wedge d(\arg(P_n))$. For almost every
$\epsilon = (\epsilon_1,\dots,\epsilon_n)$ in a neighborhood of the origin,
$\Gamma_\xi(\epsilon)$ is smooth and by Stokes' Theorem the integral (1.27) is
independent of $\epsilon$. The choice of the orientation form implies that
$\operatorname{res}_{\langle P\rangle,\xi}(H)$ is skew-symmetric on
$P_1,\dots,P_n$. We note that this definition makes sense as long as $H$ is
holomorphic in a neighborhood of $\xi$. If $\xi \in Z(I)$ is a point of
multiplicity one then the Jacobian

$$J_{\langle P\rangle}(\xi) = \det\left(\frac{\partial P_i}{\partial x_j}(\xi)\right)$$

is non-zero, and

$$\operatorname{res}_{\langle P\rangle,\xi}(H) = \frac{H(\xi)}{J_{\langle P\rangle}(\xi)}. \tag{1.28}$$

This identity follows from making a change of coordinates $y_i = P_i(x)$ and
iterated integration.

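It follows from Stokes' theorem that if $H \in I_\xi$, the ideal defined by $I$
in the local ring defined by $\xi$ (cf. Section 2.1.3 in Chapter 2), then
$\operatorname{res}_{\langle P\rangle,\xi}(H) = 0$ and therefore the local
residue defines a map
$\operatorname{res}_{\langle P\rangle,\xi} : A_\xi \to \mathbb{C}$. We then
define the global residue map as the sum of local residues

$$\operatorname{res}_{\langle P\rangle}(H) = \sum_{\xi \in Z(I)}
\operatorname{res}_{\langle P\rangle,\xi}(H)$$

which we may view as a map
$\operatorname{res}_{\langle P\rangle} : A \to \mathbb{C}$. We may also define
the global residue $\operatorname{res}_{\langle P\rangle}(H_1/H_2)$ of a
rational function regular on $Z(I)$, i.e. such that $H_2$ does not vanish on
$Z(I)$. At this point one may be tempted to replace the local cycles
$\Gamma_\xi(\epsilon)$ by a global cycle

$$\Gamma(\epsilon) := \{x \in \mathbb{C}^n : |P_i(x)| = \epsilon_i\}$$

but $\Gamma(\epsilon)$ need not be compact and integration might not converge.
However, if the map

$$(P_1,\dots,P_n) : \mathbb{C}^n \to \mathbb{C}^n$$

is proper, then $\Gamma(\epsilon)$ is compact and we can write

$$\operatorname{res}_{\langle P\rangle}(H) := \frac{1}{(2\pi i)^n}
\int_{\Gamma(\epsilon)} \frac{H(x)}{P_1(x)\cdots P_n(x)}\,
dx_1 \wedge \cdots \wedge dx_n.$$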
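When all zeros are simple, the global residue is just the finite sum of the values $H(\xi)/J_{\langle P\rangle}(\xi)$ given by (1.28). The following Python sketch illustrates this on a toy system of our own choosing, $P_1 = x^2 - 1$, $P_2 = y^2 - 4$, which does not appear in the text; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example system: P1 = x^2 - 1, P2 = y^2 - 4.
# Its four zeros (+-1, +-2) are all simple.
ZEROS = [(Fraction(x), Fraction(y)) for x, y in product((1, -1), (2, -2))]

def jacobian(x, y):
    # det [[dP1/dx, dP1/dy], [dP2/dx, dP2/dy]] = det [[2x, 0], [0, 2y]]
    return 4 * x * y

def global_residue(H):
    """Sum of the local residues H(xi) / J(xi) over the simple zeros."""
    return sum(H(x, y) / jacobian(x, y) for x, y in ZEROS)

print(global_residue(lambda x, y: 1))       # 0
print(global_residue(lambda x, y: x * y))   # 1
```

For $H = 1$ the four local residues $\pm 1/8$ cancel, while for $H = xy$ each local residue equals $1/4$.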


The following two theorems summarize basic properties of the local and
global residue map.


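Theorem 1.5.1 (Local and Global Duality). Let
$I = \langle P_1,\dots,P_n\rangle \subset \mathbb{C}[x_1,\dots,x_n]$ be a
complete intersection ideal and $A = \mathbb{C}[x_1,\dots,x_n]/I$. Let $A_\xi$
be the localization at $\xi \in Z(I)$. The pairings

$$A_\xi \times A_\xi \to \mathbb{C} \ ; \quad ([H_1],[H_2]) \mapsto
\operatorname{res}_{\langle P\rangle,\xi}(H_1 \cdot H_2)$$

and

$$A \times A \to \mathbb{C} \ ; \quad ([H_1],[H_2]) \mapsto
\operatorname{res}_{\langle P\rangle}(H_1 \cdot H_2)$$

are non-degenerate.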
Theorem 1.5.2 (Local and Global Transformation Laws). Let
$I = \langle P_1,\dots,P_n\rangle$ and $J = \langle Q_1,\dots,Q_n\rangle$ be
zero-dimensional ideals such that $J \subset I$. Let

$$Q_j(x) = \sum_{i=1}^n a_{ij}(x)\, P_i(x).$$

Denote by $A(x)$ the $n \times n$-matrix $(a_{ij}(x))$, then for any
$\xi \in Z(I)$,

$$\operatorname{res}_{\langle P\rangle,\xi}(H) =
\operatorname{res}_{\langle Q\rangle,\xi}(H \cdot \det(A)). \tag{1.29}$$

Moreover, a similar formula holds for global residues

$$\operatorname{res}_{\langle P\rangle}(H) =
\operatorname{res}_{\langle Q\rangle}(H \cdot \det(A)).$$


Remark 1.5.3. We refer the reader to [Tsi92, Sect. 5.6 and 8.4] for a proof
of the duality theorems and to [Tsi92, Sect. 5.5 and 8.3] for full proofs of
the transformation laws. The local theorems are proved in [GH78, Sect. 5.1]
and extended to the global case in [TY84]; a General Global Duality Law is
discussed in [GH78, Sect. 5.4]. Here we will just make a few remarks about
Theorem 1.5.2.

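Suppose that $\xi \in Z(I)$ is a simple zero and that $\det(A(\xi)) \neq 0$.
Then, since

$$J_{\langle Q\rangle}(\xi) = J_{\langle P\rangle}(\xi) \cdot \det(A(\xi))$$

we have

$$\operatorname{res}_{\langle P\rangle,\xi}(H) =
\frac{H(\xi)}{J_{\langle P\rangle}(\xi)} =
\frac{H(\xi) \cdot \det(A(\xi))}{J_{\langle Q\rangle}(\xi)} =
\operatorname{res}_{\langle Q\rangle,\xi}(H \cdot \det(A)),$$

as asserted by (1.29). The case of non-simple zeros which are common to both
$I$ and $J$ is dealt with using a perturbation technique after showing that when
the input $\{P_1,\dots,P_n\}$ depends smoothly on a parameter so does the
residue. Finally, one shows that if $\xi \in Z(J) \setminus Z(I)$, then
$\det(A) \in J_\xi$ and the local residue
$\operatorname{res}_{\langle Q\rangle,\xi}(H \cdot \det(A))$ vanishes.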
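The global transformation law can be tested numerically on a small example in which every zero is simple, so that each local residue is given by the quotient $H(\xi)/J(\xi)$. The sketch below uses a toy pair of our own, $\langle P\rangle = (x^2 - 1,\ y^2 - 4)$ and $\langle Q\rangle = (x \cdot P_1,\ P_2)$, so that $A = \operatorname{diag}(x, 1)$ and $\det(A) = x$. None of these names or systems come from the text.

```python
from fractions import Fraction
from itertools import product

def residue(zeros, jac, H):
    """Global residue for a system whose zeros are all simple."""
    return sum(Fraction(H(x, y)) / jac(x, y) for x, y in zeros)

# <P> = (x^2 - 1, y^2 - 4) and <Q> = (x * P1, P2) = (x^3 - x, y^2 - 4).
zeros_P = list(product((1, -1), (2, -2)))
zeros_Q = list(product((0, 1, -1), (2, -2)))

def jac_P(x, y):
    return 4 * x * y                      # det [[2x, 0], [0, 2y]]

def jac_Q(x, y):
    return (3 * x * x - 1) * 2 * y        # det [[3x^2 - 1, 0], [0, 2y]]

def det_A(x, y):
    return x                              # A = diag(x, 1)

for H in (lambda x, y: 1, lambda x, y: x * y, lambda x, y: x**3 * y + y):
    lhs = residue(zeros_P, jac_P, H)
    rhs = residue(zeros_Q, jac_Q, lambda x, y: H(x, y) * det_A(x, y))
    assert lhs == rhs
```

The two extra zeros of $\langle Q\rangle$, at $x = 0$, contribute nothing because $\det(A)$ vanishes there, in line with the last step of the argument above.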


1.5.2 Geometric definition 


For the sake of completeness, we include a few comments about the geometric 
definition of the residue of Arnold, Varchenko and Gusein-Zadé [AGZV85]. 
Here, the starting point is the definition of the residue at a simple zero
$\xi \in Z(I)$ as in (1.28). Suppose now that $\xi \in Z(I)$ has multiplicity
$\mu$. In a sufficiently small neighborhood $U$ of $\xi$ in $\mathbb{C}^n$ we
can consider the map

$$P = (P_1,\dots,P_n) : U \to \mathbb{C}^n.$$

By Sard's theorem, almost all values $y \in P(U)$ are regular and at such points
the equation $P(x) - y = 0$ has exactly $\mu$ simple roots
$\eta_1(y),\dots,\eta_\mu(y)$. Consider the map

$$\phi(y) := \sum_{i=1}^{\mu} \frac{H(\eta_i(y))}{J_{\langle P\rangle}(\eta_i(y))}.$$

It is shown in [AGZV85, Sect. 5.18] that $\phi(y)$ extends holomorphically to
$0 \in \mathbb{C}^n$. We can then define the local residue
$\operatorname{res}_{\langle P\rangle,\xi}(H)$ as the value $\phi(0)$. A
continuity argument shows that both definitions agree.
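As a small illustration of our own (the case $n = 1$, not worked out in the text), take $P(x) = x^2$ and $\xi = 0$, a zero of multiplicity $\mu = 2$. For $y \neq 0$ the equation $x^2 - y = 0$ has the two simple roots $\pm\sqrt{y}$, the Jacobian is $2x$, and the sum of the quotients $H/J_{\langle P\rangle}$ over these roots is

```latex
\phi(y) \;=\; \frac{H(\sqrt{y})}{2\sqrt{y}} \;-\; \frac{H(-\sqrt{y})}{2\sqrt{y}},
\qquad\text{so}\qquad
\phi(y) = 0 \ \text{ for } H = 1,
\qquad
\phi(y) = 1 \ \text{ for } H = x .
```

In both cases $\phi$ extends holomorphically to $y = 0$, and the values $\phi(0)$ agree with the univariate residues at $0$ of $dx/x^2$ and $x\,dx/x^2$, namely $0$ and $1$.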


1.5.3 Residue from Bezoutian 


In this section we generalize to the multivariable case the univariate approach 
discussed in Section 1.2.1. This topic is also discussed in Section 3.3 of Chap- 
ter 3. We will follow the presentation of [BCRS96] and [RS98] to which we 
refer the reader for details and proofs. We note that other purely algebraic 
definitions of the residue may also be found in [KK87, Kun86, S75, SS79]. 

Let $K$ be an algebraically closed field of characteristic zero and let $A$
be a finite-dimensional commutative $K$-algebra. Recall that $A$ is said to be a
Gorenstein algebra if there exists a linear form
$\ell \in \widehat{A} := \operatorname{Hom}_K(A, K)$ such that the bilinear form

$$\phi_\ell : A \times A \to K \ ; \quad \phi_\ell(a,b) = \ell(a \cdot b)$$

is non-degenerate. Given such a dualizing linear form $\ell$, let
$\{a_1,\dots,a_r\}$ and $\{b_1,\dots,b_r\}$ be $\phi_\ell$-dual bases of $A$,
and set

$$B_\ell := \sum_{i=1}^r a_i \otimes b_i \in A \otimes A.$$

$B_\ell$ is independent of the choice of dual bases and is called a generalized
Bezoutian. It is characterized by the following two properties:

• $(a \otimes 1) B_\ell = (1 \otimes a) B_\ell$, for all $a \in A$, and
• If $\{a_1,\dots,a_r\}$ is a basis of $A$ and
  $B_\ell = \sum_{i=1}^r a_i \otimes b_i$, then $\{b_1,\dots,b_r\}$ is a
  basis of $A$ as well.

It is shown in [BCRS96, Th. 2.10] that the correspondence
$\ell \mapsto B_\ell$ is an equivalence between dualizing linear forms on $A$
and generalized Bezoutians in $A \otimes A$.

As in Section 1.2.5 we can relate the dualizing form, the Bezoutian and
the computation of traces. The dual $\widehat{A}$ may be viewed as a module over
$A$ by $a \cdot \lambda(b) := \lambda(a \cdot b)$, $a, b \in A$,
$\lambda \in \widehat{A}$. A dualizing form $\ell \in \widehat{A}$ generates
$\widehat{A}$ as an $A$-module. Moreover, it defines an isomorphism
$A \to \widehat{A}$, $a \mapsto \ell(a\,\bullet)$. In particular there exists a
unique element $J_\ell \in A$ such that
$\operatorname{tr}(M_q) = \ell(J_\ell \cdot q)$, where $M_q : A \to A$ denotes
multiplication by $q \in A$. On the other hand, if $\{a_1,\dots,a_r\}$ and
$\{b_1,\dots,b_r\}$ are $\phi_\ell$-dual bases of $A$, then

$$M_q(a_j) = q \cdot a_j = \sum_{i=1}^r \phi_\ell(q \cdot a_j, b_i)\, a_i$$

and therefore

$$\operatorname{tr}(M_q) = \sum_{i=1}^r \phi_\ell(q \cdot a_i, b_i)
= \sum_{i=1}^r \ell(q \cdot a_i \cdot b_i)
= \ell\Big(q \cdot \sum_{i=1}^r a_i b_i\Big)$$

from which it follows that

$$J_\ell = \sum_{i=1}^r a_i \cdot b_i. \tag{1.30}$$

Note that, in particular,

$$\ell(J_\ell) = \sum_{i=1}^r \ell(a_i \cdot b_i) = r = \dim(A). \tag{1.31}$$
Suppose now that $I \subset K[x_1,\dots,x_n]$ is a zero-dimensional complete
intersection ideal. We may assume without loss of generality that $I$ is
generated by a regular sequence $\{P_1,\dots,P_n\}$. The quotient algebra
$A = K[x_1,\dots,x_n]/I$ is a Gorenstein algebra. This can be shown either by
defining directly a dualizing linear form (global residue or Kronecker symbol)
or by defining an explicit Bezoutian as in [BCRS96, Sect. 3]:

Let

$$\partial_j P_i := \frac{P_i(y_1,\dots,y_{j-1},x_j,\dots,x_n)
- P_i(y_1,\dots,y_j,x_{j+1},\dots,x_n)}{x_j - y_j} \tag{1.32}$$

and set

$$\Delta_{\langle P\rangle}(x,y) := \det(\partial_j P_i) \in K[x,y]. \tag{1.33}$$

We shall also denote by $\Delta_{\langle P\rangle}(x,y)$ its image in the
tensor algebra

$$A \otimes A \cong K[x,y]/\langle P_1(x),\dots,P_n(x),
P_1(y),\dots,P_n(y)\rangle. \tag{1.34}$$


Remark 1.5.4. In the analytic context, the polynomials $\partial_j P_i$ are the
coefficients of the so called Hefer expansion of $P_i$. We refer to [TY84] for
the relationship between Hefer expansions and residues.


Theorem 1.5.5. The element $\Delta_{\langle P\rangle}(x,y) \in A \otimes A$ is
a generalized Bezoutian.


This is Theorem 3.2 in [BCRS96]. It is easy to check that
$\Delta_{\langle P\rangle}$ satisfies the first condition characterizing
generalized Bezoutians. Indeed, given the identification (1.34), it suffices to
show that
$[f(x)] \cdot \Delta_{\langle P\rangle}(x,y) = [f(y)] \cdot \Delta_{\langle P\rangle}(x,y)$
for all $[f] \in A$. This follows directly from the definition of
$\Delta_{\langle P\rangle}$. The proof of the second property is much harder.
Becker et al. show it by reduction to the local case where it is obtained
through a deformation technique somewhat similar to that used in the geometric
case in [AGZV85].
We denote by $\tau$ the Kronecker symbol; that is, the dualizing linear form
associated with the Bezoutian $\Delta_{\langle P\rangle}$. As we shall see
below, for $K = \mathbb{C}$, the Kronecker symbol agrees with the global
residue. In order to keep the context clear, we will continue to use the
expression Kronecker symbol throughout this section.

If $H_1/H_2$ is a rational function such that $H_2$ does not vanish on $Z(I)$,
then $[H_2]$ has an inverse $[G_2]$ in $A$ and we define
$\tau(H_1/H_2) := \tau([H_1] \cdot [G_2])$.

If $\{[x^\alpha]\}$ is a monomial basis of $A$ and we write

$$\Delta_{\langle P\rangle}(x,y) = \sum_\alpha x^\alpha\, \Delta_\alpha(y)$$

then $\{[x^\alpha]\}$ and $\{[\Delta_\alpha(x)]\}$ are dual bases and it follows
from (1.30) and (1.34) that

$$J_{\langle P\rangle}(x) = \sum_\alpha x^\alpha\, \Delta_\alpha(x)
= \Delta_{\langle P\rangle}(x,x).$$

Since $\lim_{y \to x} \partial_j P_i(x,y) = \partial P_i/\partial x_j$, it
follows that $J_{\langle P\rangle}(x)$ agrees with the standard Jacobian of the
polynomials $P_1,\dots,P_n$. As we did in Section 1.1.2 for univariate residues,
we can go from the global Kronecker symbol to local operators. Let
$Z(I) = \{\xi_1,\dots,\xi_s\}$ and let

$$I = \bigcap_{\xi \in Z(I)} I_\xi$$

be the primary decomposition of $I$ as in Section 2.1.3 of Chapter 2. Let
$A_\xi = K[x_1,\dots,x_n]/I_\xi$; we have an isomorphism:

$$A \cong \prod_{\xi \in Z(I)} A_\xi.$$

We recall (cf. [CLO98, Sect. 4.2]) that there exist idempotents
$e_\xi \in K[x_1,\dots,x_n]$ such that, in $A$,
$\sum_{\xi \in Z(I)} e_\xi = 1$, $e_{\xi_i} e_{\xi_j} = 0$ if $i \neq j$, and
$e_\xi^2 = e_\xi$. These generalize the interpolating polynomials we discussed
in Section 1.1.2. We can now define

$$\tau_\xi([H]) := \tau(e_\xi \cdot [H])$$

and it follows easily that the global Kronecker symbol is the sum of the local
ones. In analogy with the global case, we may define the local Kronecker
symbol $\tau_\xi([H_1/H_2])$ of a rational function $H_1/H_2$, regular at
$\xi$, as $\tau_\xi([H_1] \cdot [G_2])$, where $[G_2]$ is the inverse of $[H_2]$
in the algebra $A_\xi$. The following proposition shows that in the case of
simple zeros and $K = \mathbb{C}$, the Kronecker symbol agrees with the global
residue defined in Section 1.5.1.


Proposition 1.5.6. Suppose that $J_{\langle P\rangle}(\xi) \neq 0$ for all
$\xi \in Z(I)$. Then

$$\tau(H) = \sum_{\xi \in Z(I)} \frac{H(\xi)}{J_{\langle P\rangle}(\xi)} \tag{1.35}$$

for all $H \in K[x_1,\dots,x_n]$.
Proof. Recall that the assumption that $J_{\langle P\rangle}(\xi) \neq 0$ for
all $\xi \in Z(I)$ implies that $[J_{\langle P\rangle}]$ is invertible in $A$.
Indeed, since $J_{\langle P\rangle}, P_1,\dots,P_n$ have no common zeros in
$K^n$, the Nullstellensatz implies that there exists
$G \in K[x_1,\dots,x_n]$ such that

$$G \cdot J_{\langle P\rangle} = 1 \mod I.$$

Given $H \in K[x_1,\dots,x_n]$, consider the trace of the multiplication map
$M_{H \cdot G} : A \to A$. On the one hand, we have from Theorem 2.1.4 in
Chapter 2 that

$$\operatorname{tr}(M_{H \cdot G}) = \sum_{\xi \in Z(I)} H(\xi)\, G(\xi)
= \sum_{\xi \in Z(I)} \frac{H(\xi)}{J_{\langle P\rangle}(\xi)}.$$

But, recalling the definition of the Jacobian we also have

$$\operatorname{tr}(M_{H \cdot G}) = \tau(J_{\langle P\rangle} \cdot G \cdot H) = \tau(H)$$

and (1.35) follows.


Remark 1.5.7. As in the geometric case discussed in Section 1.5.2 one can use 
continuity arguments to show that the identification between the Kronecker 
symbol and the global residue extends to the general case. We refer the reader 
to [RS98] for a proof of this fact as well as for a proof of the Transformation 
Laws in this context. In particular, Theorem 1.5.2 holds over any algebraically 
closed field of characteristic zero. 


1.5.4 Computation of residues 


In this section we would like to discuss briefly some methods for the
computation of global residues; a further method is discussed in Section 3.3.1
in Chapter 3. Of course, if the zero-dimensional ideal
$I = \langle P_1,\dots,P_n\rangle$ is radical and we can compute the zeros
$Z(I)$, then we can use (1.28) to compute the local and global residue. We also
point out that the transformation law gives a general, though not very
efficient, algorithm to compute local and global residues. Indeed, since $I$ is
a zero dimensional ideal there exist univariate polynomials
$f_1(x_1), f_2(x_2), \dots, f_n(x_n)$ in the ideal $I$. In particular we can
write

$$f_j(x_j) = \sum_{i=1}^n a_{ij}(x)\, P_i(x)$$

and for any $H \in K[x_1,\dots,x_n]$,

$$\operatorname{res}_{\langle P\rangle}(H) =
\operatorname{res}_{\langle f\rangle}(H \cdot \det(a_{ij})). \tag{1.36}$$

Moreover, the right hand side of the above equation may be computed as
an iterated sequence of univariate residues. What makes this a less than
desirable computational method is that even if the polynomials $P_1,\dots,P_n$
and $f_1,\dots,f_n$ are very simple, the coefficients $a_{ij}(x)$ need not be
so. The following example illustrates this.
Consider the polynomials

$$\begin{aligned}
P_1 &= x_1^2 - x_3 \\
P_2 &= x_2 - x_1 x_3^2 \\
P_3 &= x_3^2 - x_1^3
\end{aligned} \tag{1.37}$$

The ideal $I = \langle P_1, P_2, P_3\rangle$ is a zero-dimensional ideal; the
algebra $A$ has dimension four, and the zero-locus $Z(I)$ consists of two
points, the origin, which has multiplicity three, and the point $(1,1,1)$.
Gröbner basis computations with respect to lexicographic orders give the
following univariate polynomials in the ideal $I$:

$$\begin{aligned}
f_1 &= x_1^4 - x_1^3 \\
f_2 &= x_2^2 - x_2 \\
f_3 &= x_3^3 - x_3^2.
\end{aligned} \tag{1.38}$$

We observe that we could also have used iterated resultants to find univariate
polynomials in $I$. However, this will generally yield higher degree
polynomials. For instance, for our example (1.37) a Singular [GPS01]
computation gives:


> resultant(resultant(P_1,P_2,x_3), resultant(P_2,P_3,x_3), x_2);
x_1^10-2*x_1^9+x_1^8


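Returning to the polynomials (1.38), we can obtain, using the Singular command
"division", a coefficient matrix $A = (a_{ij}(x))$:

$$A = \begin{pmatrix}
x_1^2 + x_3 &
x_3^3 + (x_1^2 + x_1 + 1)x_3^2 + (x_1^2 + x_1 + x_2)x_3 + x_1^2 x_2 &
(x_1 + 1)x_3 + x_1^2 \\
0 & x_1^3 + x_2 - 1 & 0 \\
1 & (x_1 + 1)(x_2 + x_3) + x_3^2 & x_1 + x_3
\end{pmatrix}$$

So that

$$\begin{aligned}
\det(A) ={}& (x_2 + x_1^3 - 1)\,x_3^2
+ (x_1^2 x_2 + x_1^5 - x_1^3 - x_1^2 - x_2 + 1)\,x_3 \\
&+ x_1^3 x_2 + x_1^6 - x_1^5 - x_1^2 x_2 - x_1^3 + x_1^2.
\end{aligned}$$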

Rather than continuing with the computation of a global residue
$\operatorname{res}_{\langle P\rangle}(H)$ using (1.36) and iterated univariate
residues or Bezoutians, we will refer the reader to Chapter 3 where improved
versions are presented and discuss instead how we can use the multivariate
Bezoutian in computations. The Bezoutian matrix $(\partial_j P_i)$ is given by

$$\begin{pmatrix}
x_1 + y_1 & -x_3^2 & -(x_1^2 + x_1 y_1 + y_1^2) \\
0 & 1 & 0 \\
-1 & -y_1(x_3 + y_3) & x_3 + y_3
\end{pmatrix}$$

and therefore

$$\Delta_{\langle P\rangle}(x,y) = x_1 x_3 + x_1 y_3 + x_3 y_1 + y_1 y_3
- x_1^2 - x_1 y_1 - y_1^2.$$
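The divided differences of (1.32) and the resulting determinant can be spot-checked by evaluating them at a sample point. The sketch below is our own verification aid, not part of the text: it evaluates the quotients $\partial_j P_i$ for the polynomials (1.37) at rational points $x, y$ with $x_j \neq y_j$, and compares the determinant with the closed form $x_1x_3 + x_1y_3 + x_3y_1 + y_1y_3 - x_1^2 - x_1y_1 - y_1^2$. The function names are hypothetical.

```python
from fractions import Fraction

def P(v):
    """The polynomials (1.37) evaluated at the point v = (v1, v2, v3)."""
    v1, v2, v3 = v
    return (v1**2 - v3, v2 - v1 * v3**2, v3**2 - v1**3)

def bezout_matrix(x, y):
    """Entry [j][i] is the divided difference d_j P_i of (1.32) at (x, y)."""
    rows = []
    for j in range(3):
        left = P(y[:j] + x[j:])             # (y_1..y_{j-1}, x_j..x_n)
        right = P(y[:j + 1] + x[j + 1:])    # (y_1..y_j, x_{j+1}..x_n)
        rows.append([(l - r) / (x[j] - y[j]) for l, r in zip(left, right)])
    return rows

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def delta_closed_form(x, y):
    x1, _, x3 = x
    y1, _, y3 = y
    return x1*x3 + x1*y3 + x3*y1 + y1*y3 - x1**2 - x1*y1 - y1**2

x = tuple(Fraction(c) for c in (2, 3, 5))
y = tuple(Fraction(c) for c in (1, 1, 1))
assert bezout_matrix(x, y) == [[3, -25, -7], [0, 1, 0], [-1, -6, 6]]
assert det3(bezout_matrix(x, y)) == delta_closed_form(x, y) == 11
```

A check at one point is of course not a proof of the polynomial identity; repeating it at several points only raises confidence in the transcription of the matrix.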


Computing a Gröbner basis relative to grevlex gives a monomial basis of $A$
of the form $\{1, x_1, x_2, x_3\}$. Reducing $\Delta_{\langle P\rangle}(x,y)$
relative to the corresponding basis of $A \otimes A$ we obtain:

$$\Delta_{\langle P\rangle}(x,y) = (y_2 - y_3) + (y_3 - y_1)\,x_1 + x_2 + (y_1 - 1)\,x_3.$$

Hence the dual basis of $\{1, x_1, x_2, x_3\}$ is the basis
$\{x_2 - x_3,\ x_3 - x_1,\ 1,\ x_1 - 1\}$.
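We now claim that given $H \in K[x_1,\dots,x_n]$, if we compute the grevlex
normal form:

$$N(H) = \lambda_0 + \lambda_1 [x_1] + \lambda_2 [x_2] + \lambda_3 [x_3]$$

then $\operatorname{res}_{\langle P\rangle}(H) = \lambda_2$. More generally,
suppose that $\{[x^\alpha]\}$ is a monomial basis of $A$ and that
$\{[\Delta_\alpha(x)]\}$ is the dual basis given by the Bezoutian, then if
$[H] = \sum_\alpha \lambda_\alpha [x^\alpha]$ and
$1 = \sum_\alpha \mu_\alpha [\Delta_\alpha]$,

$$\operatorname{res}_{\langle P\rangle}(H) = \sum_\alpha \lambda_\alpha \mu_\alpha. \tag{1.39}$$

Indeed, we have

$$\begin{aligned}
\operatorname{res}_{\langle P\rangle}(H)
&= \operatorname{res}_{\langle P\rangle}(H \cdot 1)
= \operatorname{res}_{\langle P\rangle}\Big(\sum_\alpha \lambda_\alpha x^\alpha
\cdot \sum_\beta \mu_\beta \Delta_\beta\Big) \\
&= \sum_{\alpha,\beta} \lambda_\alpha \mu_\beta\,
\operatorname{res}_{\langle P\rangle}(x^\alpha \cdot \Delta_\beta)
= \sum_\alpha \lambda_\alpha \mu_\alpha.
\end{aligned}$$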
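For the example (1.37) the whole computation fits in a four-dimensional multiplication table. The sketch below is our own illustration and rests on relations we derived by hand from (1.37), namely $x_1^2 = x_3$ and $x_1x_3 = x_3^2 = x_1x_2 = x_2x_3 = x_2^2 = x_2$ in $A$; treat the table as an assumption of the sketch. It checks that $\{x_2 - x_3,\ x_3 - x_1,\ 1,\ x_1 - 1\}$ is dual to $\{1, x_1, x_2, x_3\}$ under the functional "coefficient of $x_2$", and that this functional takes the value $\dim A = 4$ on the Jacobian $4x_1x_3 - 3x_1^2$.

```python
# Basis of A in the order (1, x1, x2, x3); an element is a list of 4 coefficients.
# TABLE[i][j] is the index of the basis element equal to e_i * e_j in A
# (hand-derived from (1.37): every product of basis elements is a basis element).
TABLE = [
    [0, 1, 2, 3],
    [1, 3, 2, 2],
    [2, 2, 2, 2],
    [3, 2, 2, 2],
]

def mul(u, v):
    out = [0, 0, 0, 0]
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[TABLE[i][j]] += ui * vj
    return out

def res(u):
    """Candidate global residue: the coefficient of x2 in the normal form."""
    return u[2]

ONE, X1, X2, X3 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
BASIS = [ONE, X1, X2, X3]
# x2 - x3, x3 - x1, 1, x1 - 1 in coordinates.
DUAL = [[0, 0, 1, -1], [0, -1, 0, 1], [1, 0, 0, 0], [-1, 1, 0, 0]]

pairing = [[res(mul(a, b)) for b in DUAL] for a in BASIS]
assert pairing == [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Jacobian 4*x1*x3 - 3*x1^2 in coordinates, and its residue.
jac = [4 * s - 3 * t for s, t in zip(mul(X1, X3), mul(X1, X1))]
assert res(jac) == 4
```

Since the element $1$ is the dual partner of $x_2$, only $\mu_{x_2} = 1$ is non-zero in (1.39), which is why the residue reduces to the single coefficient $\lambda_2$ here.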


Although the computational method based on the Bezoutian allows us to
compute $\operatorname{res}_{\langle P\rangle}(H)$ as a linear combination of
normal form coefficients of $H$, it would be nice to have a method that computes
the global residue as a single normal form coefficient, generalizing the
univariate algorithm based on the identities (1.10). This can be done if we make
some further assumptions on the generators of the ideal $I$. We will discuss
here one such case which has been extensively studied both analytically and
algebraically, following the treatment in [CDS96]. A more general algorithm will
be presented in Section 1.5.6. Assume the generators $P_1,\dots,P_n$ satisfy:


Assumption: $P_1,\dots,P_n$ are a Gröbner basis for some term order $\prec$.


Since we can always find a weight $w \in \mathbb{N}^n$ such that
$\operatorname{in}_w(P_i) = \operatorname{in}_\prec(P_i)$, $i = 1,\dots,n$, and
given that $I$ is a zero dimensional ideal, it follows that, up to reordering
the generators, our assumption is equivalent to the existence of a weight $w$
such that:

$$\operatorname{in}_w(P_i) = c_i\, x_i^{r_i+1}. \tag{1.40}$$

It is clear that in this case $\dim_K(A) = (r_1+1)\cdots(r_n+1)$, and a monomial
basis of $A$ is given by $\{[x^\alpha] : 0 \le \alpha_i \le r_i\}$.

We point out that, for appropriately chosen term orders, our assumption
leads to interesting examples.
• Suppose $\prec$ is lexicographic order with $x_n \succ \cdots \succ x_1$. In
  this case

  $$P_i = c_i\, x_i^{r_i+1} + P_i'(x_1,\dots,x_i)$$

  and $\deg_{x_i}(P_i') \le r_i$. This case was considered in [DS91].
• Let $\prec$ be degree lexicographic order with $x_1 \prec \cdots \prec x_n$.
  Then

  $$P_i(x) = c_i\, x_i^{r_i+1} + \sum_{j=1}^{i-1} x_j\, \phi_{ij}(x) + \psi_i(x),$$

  where $\deg(\phi_{ij}) = r_i$ and $\deg(\psi_i) \le r_i$. This case has been
  extensively studied by the Krasnoyarsk School (see, for example,
  [AY83, Ch. 21] and [Tsi92, II.8.2]) using integral methods. Some of their
  results have been transcribed to the algebraic setting in [BGV02] under the
  name of Pham systems of type II.


Note also that the polynomials in (1.37) satisfy these conditions. Indeed,
for $w = (3, 14, 5)$ we have:

$$\operatorname{in}_w(P_1) = x_1^2, \quad \operatorname{in}_w(P_2) = x_2, \quad
\operatorname{in}_w(P_3) = x_3^2. \tag{1.41}$$


The following theorem, which may be viewed as a generalization of the 
basic univariate definition (1.1), is due to Aizenberg and Tsikh. Its proof may 
be found in [AY83, Ch. 21] and [CDS96, Th. 2.3]. 


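Theorem 1.5.8. Let $P_1,\dots,P_n \in \mathbb{C}[x_1,\dots,x_n]$ satisfy (1.40)
and write $P_i = c_i\, x_i^{r_i+1} + P_i'$. Then for any
$H \in \mathbb{C}[x_1,\dots,x_n]$,
$\operatorname{res}_{\langle P\rangle}(H)$ is equal to the
$\dfrac{1}{x_1 \cdots x_n}$-coefficient of the Laurent series expansion of:

$$\frac{H(x)}{\prod_i c_i\, x_i^{r_i+1}}\ \prod_i
\frac{1}{1 + P_i'(x)/(c_i\, x_i^{r_i+1})} \tag{1.42}$$

obtained through geometric expansions.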


The following simple consequence of Theorem 1.5.8 generalizes (1.10) and 
is the basis for its algorithmic applications. 


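Corollary 1.5.9. Let $P_1,\dots,P_n \in \mathbb{C}[x_1,\dots,x_n]$ satisfy
(1.40) and let $\{[x^\alpha] : 0 \le \alpha_i \le r_i\}$ be the corresponding
monomial basis of $A$. Let $\mu = (r_1,\dots,r_n)$; then

$$\operatorname{res}_{\langle P\rangle}([x^\alpha]) =
\begin{cases}
0 & \text{if } \alpha \neq \mu \\[4pt]
\dfrac{1}{c_1 \cdots c_n} & \text{if } \alpha = \mu
\end{cases} \tag{1.43}$$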


Remark 1.5.10. A proof of (1.43) using the Bezoutian approach may be found 
in [BCRS96]. Hence, Corollary 1.5.9 may be used in the algebraic setting as 
well. 


As in the univariate case, we are led to the following algorithm for
computing residues when $P_1,\dots,P_n$ satisfy (1.40).

Algorithm 1: Compute the normal form $N(H)$ of $H \in K[x_1,\dots,x_n]$ relative
to any term order which refines $w$-degree. Then,

$$\operatorname{res}_{\langle P\rangle}(H) = \frac{a_\mu}{c_1 \cdots c_n}, \tag{1.44}$$

where $a_\mu$ is the coefficient of $x^\mu$ in $N(H)$.


Remark 1.5.11. Given a weight $w$ for which (1.40) holds it is easy to carry out
the computations in the above algorithm using the weighted orders wp (weighted
grevlex) and Wp (weighted deglex) in Singular [GPS01]. For example, for the
polynomials in (1.37), the Jacobian is
$J_{\langle P\rangle}(x) = 4x_1x_3 - 3x_1^2$ and we get:

> ring R = 0, (x1, x2, x3), wp(3,14,5);
> ideal I = x1^2-x3, x2-x1*x3^2, x3^2 - x1^3;
> reduce(4*x1*x3 - 3*x1^2, std(I));
4*x1*x3-3*x3


Thus, the $x_1x_3$ coefficient of the normal form of
$J_{\langle P\rangle}(x)$ is 4, i.e. $\dim_K(A)$ as asserted by (1.31).
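The same normal form can be reproduced outside Singular by treating each generator of (1.37) as a rewriting rule that replaces its $w$-initial monomial by the remaining terms. The Python sketch below is our own stand-in for `reduce`, written for this one example: the names `RULES`, `normal_form` and `residue` are hypothetical, polynomials are dictionaries from exponent triples to coefficients, and termination relies on every rule lowering the weight for $w = (3, 14, 5)$.

```python
# Rewriting rules lead -> tail, one per generator of (1.37), where
# P_i = lead_i - tail_i and lead_i = in_w(P_i) for w = (3, 14, 5).
RULES = [
    ((2, 0, 0), {(0, 0, 1): 1}),   # x1^2 -> x3
    ((0, 1, 0), {(1, 0, 2): 1}),   # x2   -> x1*x3^2
    ((0, 0, 2), {(3, 0, 0): 1}),   # x3^2 -> x1^3
]

def normal_form(poly):
    """Rewrite until no monomial is divisible by a leading monomial."""
    todo = dict(poly)
    result = {}
    while todo:
        mono, coeff = todo.popitem()
        if coeff == 0:
            continue
        for lead, tail in RULES:
            if all(m >= l for m, l in zip(mono, lead)):
                quotient = tuple(m - l for m, l in zip(mono, lead))
                for t_mono, t_coeff in tail.items():
                    new = tuple(q + t for q, t in zip(quotient, t_mono))
                    todo[new] = todo.get(new, 0) + coeff * t_coeff
                break
        else:
            # No rule applies: the monomial is already in normal form.
            result[mono] = result.get(mono, 0) + coeff
    return {m: c for m, c in result.items() if c != 0}

def residue(poly):
    """Algorithm 1 for (1.37): c_1 = c_2 = c_3 = 1 and x^mu = x1*x3."""
    return normal_form(poly).get((1, 0, 1), 0)

jacobian = {(1, 0, 1): 4, (2, 0, 0): -3}      # 4*x1*x3 - 3*x1^2
print(normal_form(jacobian))
print(residue(jacobian))                      # 4
```

The normal form has the two terms $4x_1x_3$ and $-3x_3$, matching the Singular output above, and the residue of the Jacobian is $4$.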


1.5.5 The Euler-Jacobi vanishing theorem 


We will now discuss the multivariate extension of Theorem 1.1.8. The basic
geometric assumption that we need to make is that if we embed $\mathbb{C}^n$ in
a suitable compactification then the ideal we are considering has all its zeros
in $\mathbb{C}^n$. Here we will restrict ourselves to the case when the chosen
compactification is weighted projective space. The more general vanishing
theorems are stated in terms of global residues in the torus and toric
compactifications as in [Kho78a].

Let $w \in \mathbb{N}^n$ and denote by $\deg_w$ the weighted degree defined by
$w$. We set $|w| = \sum_i w_i$. Let $I = \langle P_1,\dots,P_n\rangle$ be a
zero-dimensional complete intersection ideal and write

$$P_i(x) = Q_i(x) + P_i'(x),$$

where $Q_i(x)$ is weighted homogeneous of $w$-degree $d_i$ and
$\deg_w(P_i') < d_i$. We call $Q_i$ the leading form of $P_i$. We say that $I$
has no zeros at infinity in weighted projective space if and only if

$$Q_1(x) = \cdots = Q_n(x) = 0 \quad\text{if and only if}\quad x = 0. \tag{1.45}$$


In the algebraic context an ideal which has a presentation by generators 
satisfying (1.45) is called a strict complete intersection [KK87]. 


Theorem 1.5.12 (Euler-Jacobi vanishing). Let
$I = \langle P_1,\dots,P_n\rangle$ be a zero-dimensional complete intersection
ideal with no zeros at infinity in weighted projective space. Then,

$$\operatorname{res}_{\langle P\rangle}(H) = 0 \quad\text{if}\quad
\deg_w(H) < \sum_{i=1}^n \deg_w(P_i) - |w|.$$
Proof. We begin by proving the assertion in the particular case when
$Q_i(x) = x_i^{N+1}$. By linearity it suffices to prove that if $x^\alpha$ is a
monomial with $\langle w, \alpha\rangle < N|w|$, then
$\operatorname{res}_{\langle P\rangle}(x^\alpha) = 0$. We prove this by
induction on $\delta = \langle w, \alpha\rangle$. If $\delta = 0$ then
$x^\alpha = 1$ and the result follows from Corollary 1.5.9. Suppose then that
the result holds for any monomial of degree less than
$\delta = \langle w, \alpha\rangle$. If every $\alpha_i \le N$ then the result
follows, again, from Corollary 1.5.9. If, on the other hand, some
$\alpha_i > N$ then we can write

$$x^\alpha = x^\beta \cdot P_i - x^\beta \cdot P_i',$$

where $\beta = \alpha - (N+1)e_i$. It then follows that
$\operatorname{res}_{\langle P\rangle}(x^\alpha) =
-\operatorname{res}_{\langle P\rangle}(x^\beta \cdot P_i')$, but all the
monomials appearing in the right-hand side have weighted degree less than
$\delta$ and therefore the residue vanishes.

Consider now the general case. In view of (1.45) and the Nullstellensatz
there exists $N$ sufficiently large such that

$$x_j^{N+1} \in \langle Q_1(x),\dots,Q_n(x)\rangle.$$

In particular, we can write

$$x_j^{N+1} = \sum_{i=1}^n a_{ij}(x)\, Q_i(x),$$

where $a_{ij}(x)$ is $w$-homogeneous of degree $(N+1)w_j - d_i$. Let now

$$F_j(x) = \sum_{i=1}^n a_{ij}(x)\, P_i(x) = x_j^{N+1} + F_j'(x),$$

and $\deg_w(F_j') < (N+1)w_j$. Given now $H \in K[x_1,\dots,x_n]$ with
$\deg_w(H) < \sum_i d_i - |w|$, we have by the Global Transformation Law:

$$\operatorname{res}_{\langle P\rangle}(H) =
\operatorname{res}_{\langle F\rangle}(\det(a_{ij}) \cdot H).$$

But, $\deg_w(\det(a_{ij})) \le (N+1)|w| - \sum_i d_i$ and therefore

$$\deg_w(\det(a_{ij}) \cdot H) \le \deg_w(\det(a_{ij})) + \deg_w(H) < N|w|,$$

and the result follows from the previous case.
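The vanishing statement is easy to observe on a small system with simple zeros, where the global residue is the sum of the quotients $H(\xi)/J_{\langle P\rangle}(\xi)$. The sketch below uses an example of our own, $P_1 = x^2 + y^2 - 5$, $P_2 = xy - 2$ with $w = (1,1)$, whose leading forms $x^2 + y^2$ and $xy$ vanish simultaneously only at the origin. The bound of the theorem is then $\deg(H) < 2 + 2 - 2 = 2$. The names are hypothetical.

```python
from fractions import Fraction

# Common zeros of P1 = x^2 + y^2 - 5 and P2 = x*y - 2 (all simple).
ZEROS = [(1, 2), (2, 1), (-1, -2), (-2, -1)]

def jacobian(x, y):
    # det [[2x, 2y], [y, x]]
    return 2 * x * x - 2 * y * y

def global_residue(H):
    return sum(Fraction(H(x, y), jacobian(x, y)) for x, y in ZEROS)

# Degrees 0 and 1 lie below the bound: the residue vanishes.
assert global_residue(lambda x, y: 1) == 0
assert global_residue(lambda x, y: x) == 0
assert global_residue(lambda x, y: y) == 0

# Degree 2 is not covered by the theorem, and indeed need not vanish.
print(global_residue(lambda x, y: x * x))   # 1
```

The last line shows that the degree bound cannot be relaxed in general: already for $H = x^2$ the residue is non-zero.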


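Remark 1.5.13. The Euler-Jacobi vanishing theorem is intimately connected
to the continuity of the residue. The following argument from [AGZV85, Ch. 1,
Sect. 5] makes the link evident. Suppose $P_1,\dots,P_n$ have only simple zeros
and satisfy (1.45). For simplicity we take $w = (1,\dots,1)$; the general case
is completely analogous. Consider the family of polynomials

$$\tilde P_i(x;t) = t^{d_i}\, P_i(t^{-1}x_1,\dots,t^{-1}x_n). \tag{1.46}$$

Note that $\tilde P_i(t \cdot x; t) = t^{d_i} P_i(x)$. In particular if
$P_i(\xi) = 0$, $\tilde P_i(t\xi; t) = 0$ as well. Suppose now that
$\deg(H) < \sum_i d_i - n$ and let $\tilde H(x;t)$ be defined as in (1.46).
Then

$$\operatorname{res}_{\langle \tilde P\rangle}(\tilde H)
= \sum_{\xi \in Z(I)} \frac{\tilde H(t\xi;t)}{\operatorname{Jac}_{\langle \tilde P\rangle}(t\xi)}
= t^{a} \sum_{\xi \in Z(I)} \frac{H(\xi)}{\operatorname{Jac}_{\langle P\rangle}(\xi)}
= t^{a}\, \operatorname{res}_{\langle P\rangle}(H),$$

where $a = \deg(H) - \deg(\operatorname{Jac}_{\langle P\rangle}(x))
= \deg(H) - (\sum_i d_i - n)$. Hence, if $a < 0$, the limit

$$\lim_{t \to 0} \operatorname{res}_{\langle \tilde P\rangle}(\tilde H)$$

may exist only if $\operatorname{res}_{\langle P\rangle}(H) = 0$ as asserted by
the Euler-Jacobi theorem.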


We conclude this subsection with some applications of Theorem 1.5.12 to 
plane projective geometry (cf. [GH78, 5.2]). The following theorem is usually 
referred to as the Cayley-Bacharach Theorem though, as Eisenbud, Green, 
and Harris point out in [EGH96], it should be attributed to Chasles. 


Theorem 1.5.14 (Chasles). Let $C_1$ and $C_2$ be curves in $\mathbb{P}^2$, of
respective degrees $d_1$ and $d_2$, intersecting in $d_1 d_2$ distinct points.
Then, any curve of degree $d = d_1 + d_2 - 3$ that passes through all but one of
the points in $C_1 \cap C_2$ must pass through the remaining point as well.


Proof. After a linear change of coordinates, if necessary, we may assume that
no point in $C_1 \cap C_2$ lies in the line $x_3 = 0$. Let
$C_i = \{\tilde P_i(x_1,x_2,x_3) = 0\}$, $\deg \tilde P_i = d_i$. Set
$P_i(x_1,x_2) = \tilde P_i(x_1,x_2,1)$. Given
$\tilde H \in K[x_1,x_2,x_3]$, homogeneous of degree $d$, let
$H \in K[x_1,x_2]$ be similarly defined. We can naturally identify the points
in $C_1 \cap C_2$ with the set of common zeros

$$Z = \{\xi \in K^2 : P_1(\xi) = P_2(\xi) = 0\}.$$

Since $\deg H < \deg P_1 + \deg P_2 - 2$, Theorem 1.5.12 implies that
$\operatorname{res}_{\langle P\rangle}(H) = 0$, but then

$$0 = \operatorname{res}_{\langle P\rangle}(H)
= \sum_{\xi \in Z} \frac{H(\xi)}{\operatorname{Jac}_{\langle P\rangle}(\xi)}$$

which implies that if $H$ vanishes at all but one of the points in $Z$ it must
vanish on the remaining one as well.


Corollary 1.5.15 (Pascal's Mystic Hexagon). Consider a hexagon inscribed in a
conic curve of $\mathbb{P}^2$. Then, the pairs of opposite sides meet in
collinear points.


Proof. Let $L_1 \dots L_6$ denote the hexagon inscribed in the conic
$Q \subset \mathbb{P}^2$, where $L_i$ is a line in $\mathbb{P}^2$. Let
$\xi_{ij}$ denote the intersection point $L_i \cap L_j$. Consider the cubic
curves $C_1 = L_1 + L_3 + L_5$ and $C_2 = L_2 + L_4 + L_6$. The intersection
$C_1 \cap C_2$ consists of the nine points $\xi_{ij}$ with $i$ odd and $j$ even.
The cubic $Q + L(\xi_{14}\xi_{36})$, where $L(\xi_{14}\xi_{36})$ denotes the
line joining the two points, passes through eight of the points in
$C_1 \cap C_2$ hence must pass through the ninth point $\xi_{52}$. For degree
reasons this is only possible if $\xi_{52} \in L(\xi_{14}\xi_{36})$ and
therefore the three points are collinear.


1.5.6 Homogeneous (projective) residues 


In this section we would like to indicate how the notion of residue may be
extended to meromorphic forms in projective space. This is a special instance
of a much more general theory of residues in toric varieties. A full discussion
of this topic is beyond the scope of these notes so we will restrict ourselves to
a presentation of the basic ideas, in the case $K = \mathbb{C}$, and refer the
reader to [GH78, TY84, PS83, Cox96, CCD97] for details and proofs.

Suppose $F_0,\dots,F_n \in \mathbb{C}[x_0,\dots,x_n]$ are homogeneous
polynomials of degrees $d_0,\dots,d_n$, respectively. Let
$V_i = \{x \in \mathbb{P}^n : F_i(x) = 0\}$ and assume that

$$V_0 \cap V_1 \cap \cdots \cap V_n = \emptyset. \tag{1.47}$$

This means that the zero locus of the ideal $I = \langle F_0,\dots,F_n\rangle$
is the origin $0 \in \mathbb{C}^{n+1}$. Given any homogeneous polynomial
$H \in \mathbb{C}[x_0,\dots,x_n]$ we can define the projective residue of $H$
relative to the $(n+1)$-tuple $\langle F\rangle = \{F_0,\dots,F_n\}$ as:

$$\operatorname{res}^{\mathbb{P}^n}_{\langle F\rangle}(H) :=
\operatorname{res}_{\langle F\rangle}(H) =
\operatorname{res}_{\langle F\rangle,0}(H).$$

It is clear from the integral definition of the Grothendieck residue that the
local residue at $0$ is invariant under the change of coordinates
$x_i \mapsto \lambda x_i$, $\lambda \in \mathbb{C}^*$. On the other hand, if
$\deg(H) = d$ we see that, for

$$\rho = \sum_{i=0}^n (d_i - 1),$$

$$\frac{H(\lambda \cdot x)}{F_0(\lambda \cdot x) \cdots F_n(\lambda \cdot x)}\,
d(\lambda x_0) \wedge \cdots \wedge d(\lambda x_n)
= \lambda^{d - \rho}\, \frac{H(x)}{F_0(x) \cdots F_n(x)}\,
dx_0 \wedge \cdots \wedge dx_n.$$

Hence,

$$\operatorname{res}^{\mathbb{P}^n}_{\langle F\rangle}(H) = 0
\quad\text{if}\quad \deg(H) \neq \rho.$$

Being a global (and local) residue, the projective residue is a dualizing
form in the algebra $A = \mathbb{C}[x_0,\dots,x_n]/I$. Moreover, since $I$ is a
homogeneous ideal, $A$ is a graded algebra and the projective residue is
compatible with the grading. These duality properties are summarized in the
following theorem.


Theorem 1.5.16. The graded algebra $A = \oplus A_d$ satisfies:


a) $A_d = 0$ for $d > \rho := d_0 + \cdots + d_n - (n+1)$.
b) $A_\rho \cong \mathbb{C}$.
c) For $0 \le d \le \rho$, the bilinear pairing

$$A_d \times A_{\rho - d} \to \mathbb{C} \ ; \quad ([H_1],[H_2]) \mapsto
\operatorname{res}^{\mathbb{P}^n}_{\langle F\rangle}(H_1 \cdot H_2)$$

is non-degenerate.


Proof. The assumption (1.47) implies that $F_0,\dots,F_n$ are a regular sequence
in the ring $\mathbb{C}[x_0,\dots,x_n]$. Computing the Poincaré series for $A$
using the exactness of the Koszul sequence yields the first two assertions. See
[PS83, Sect. 12] for details. A proof using residues may be found in
[Tsi92, Sect. 20]. The last assertion follows from Theorem 1.5.1.


An important application of Theorem 1.5.16 arises in the study of smooth
hypersurfaces $X_F = \{x \in \mathbb{P}^n : F(x) = 0\}$, of degree $d$, in
projective space [CG80]. In this case we take
$F_i = \partial F/\partial x_i$, the smoothness condition means that
$\{F_0,\dots,F_n\}$ satisfy (1.47), and the Hodge structure of $X$ may be
described in terms of the Jacobian ideal generated by
$\{\partial F/\partial x_i\}$. Indeed, $\rho = (n+1)(d-2)$ and setting, for
$0 \le p \le n-1$, $\delta(p) := d(p+1) - (n+1)$, we have
$\delta(p) + \delta(n-1-p) = \rho$, and

$$H^{p,\,n-1-p}(X) \cong A_{\delta(p)}.$$

Moreover, the pairing

$$\operatorname{res}^{\mathbb{P}^n}_{\langle F\rangle} :
A_{\delta(p)} \times A_{\delta(n-1-p)} \to \mathbb{C}$$

corresponds to the intersection pairing

$$H^{p,\,n-1-p}(X) \times H^{n-1-p,\,p}(X) \to \mathbb{C}.$$

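The projective residue may be related to affine residues in a different way. If
we identify $\mathbb{C}^n \cong \{x \in \mathbb{P}^n : x_0 \neq 0\}$, then after
a linear change of coordinates, if necessary, we may assume that for every
$i = 0,\dots,n$,

$$Z_i := V_0 \cap \cdots \cap \widehat{V_i} \cap \cdots \cap V_n \subset \mathbb{C}^n. \tag{1.48}$$

Let $P_i \in \mathbb{C}[x_1,\dots,x_n]$ be the polynomial
$P_i(x_1,\dots,x_n) = F_i(1,x_1,\dots,x_n)$ and let us denote by
$\langle P^{\hat\imath}\rangle$ the $n$-tuple of polynomials
$P_0,\dots,P_{i-1},P_{i+1},\dots,P_n$.


Theorem 1.5.17. For any homogeneous polynomial
$H \in \mathbb{C}[x_0,\dots,x_n]$ with $\deg(H) \le \rho$,

$$\operatorname{res}^{\mathbb{P}^n}_{\langle F\rangle}(H) = (-1)^i\,
\operatorname{res}_{\langle P^{\hat\imath}\rangle}(h/P_i), \tag{1.49}$$

where $h(x_1,\dots,x_n) = H(1,x_1,\dots,x_n)$.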


Proof. We will only prove the second, implicit, assertion that the right-hand
side of (1.49) is independent of $i$. This statement, which generalizes the
identity (1.11), is essentially Theorem 5 in [TY84]. For the main assertion we
refer to [CCD97, Sect. 4], where it is proved in the more general setting of
simplicial toric varieties.

Note that the assumption (1.47) implies that the rational function $h/P_i$
is regular on $Z_i$ and hence it makes sense to compute
$\operatorname{res}_{\langle P^{\hat\imath}\rangle}(h/P_i)$. For each
$i = 0,\dots,n$, consider the $n$-tuple of polynomials in $K[x_1,\dots,x_n]$:
$\langle Q^i\rangle = \{P_0,\dots,(P_i \cdot P_{i+1}),\dots,P_n\}$, if $i < n$,
and $\langle Q^n\rangle = \{P_1,\dots,P_{n-1},(P_n \cdot P_0)\}$. The set of
common zeros of the polynomials in $Q^i$ is $Z(Q^i) = Z_i \cup Z_{i+1}$.
Hence, it follows from (1.48) that the ideal generated by the $n$-tuple $Q^i$ is
zero-dimensional and has no zeros at infinity. Hence, given that
$\deg(H) \le \rho$, the Euler-Jacobi vanishing theorem implies that

$$\begin{aligned}
0 = \operatorname{res}_{\langle Q^i\rangle}(h)
&= \sum_{\xi \in Z_i} \operatorname{res}_{\langle Q^i\rangle,\xi}(h)
+ \sum_{\xi \in Z_{i+1}} \operatorname{res}_{\langle Q^i\rangle,\xi}(h) \\
&= \sum_{\xi \in Z_i} \operatorname{res}_{\langle P^{\hat\imath}\rangle,\xi}(h/P_i)
+ \sum_{\xi \in Z_{i+1}} \operatorname{res}_{\langle P^{\widehat{i+1}}\rangle,\xi}(h/P_{i+1}) \\
&= \operatorname{res}_{\langle P^{\hat\imath}\rangle}(h/P_i)
+ \operatorname{res}_{\langle P^{\widehat{i+1}}\rangle}(h/P_{i+1})
\end{aligned}$$

and, consequently, the theorem follows. We should point out that the equality
$\operatorname{res}_{\langle Q^i\rangle,\xi}(h) =
\operatorname{res}_{\langle P^{\hat\imath}\rangle,\xi}(h/P_i)$, which is clear
from the integral definition of the local residue, may be obtained in the
general case from the Local Transformation Law and the fact that
$\operatorname{res}_{\langle P^{\hat\imath}\rangle,\xi}(h/P_i)$ was defined as
$\operatorname{res}_{\langle P^{\hat\imath}\rangle,\xi}(h \cdot Q_i)$, where
$Q_i$ inverts $P_i$ in the local algebra $A_\xi$ and, consequently, the argument
holds over any algebraically closed field of characteristic zero.


We can use the transformation law to exhibit a polynomial A(x) of degree 
p with non-zero residue. Write 


n 


i=0 


and set A(x) = det(a;;(a)). Then, deg(A) = p, and 
res(~(A) = 1 (1.50) 


Indeed, let (G) denote the n + 1-tuple G = {a,...,@}. Then by the trans- 
formation law 
res(c) (1) = Tesi 7) (A) 
and a direct computation shows that the left-hand side of the above identity 
is equal to 1. 
Putting together part b) of Theorem 1.5.16 with (1.50) we obtain the fol- 
lowing normal form algorithm for computing the projective residue TES 7) (1): 


Algorithm 2: 1. Compute a Gröbner basis of the ideal $\langle F_0,\dots,F_n\rangle$.
2. Compute the normal form $N(H)$ of $H$ and the normal form $N(\Delta)$ of
$\Delta$, with respect to the Gröbner basis.
3. The projective residue $\mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(H) = \dfrac{N(H)}{N(\Delta)}$.
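
The three steps can be traced by hand on a very small instance. The choice $n = 1$, $F_0 = x_0^2$, $F_1 = x_1^2$ and the coefficient names $\alpha, \beta, \gamma$ below are assumptions made only for this illustration; the polynomial $\Delta$ is built as in (1.50).

```latex
% Illustrative instance of Algorithm 2 (assumed data: n = 1, F_0 = x_0^2, F_1 = x_1^2).
\begin{align*}
  \rho &= (2-1) + (2-1) = 2, \\
  F_0 &= x_0 \cdot x_0 + 0 \cdot x_1, \qquad F_1 = 0 \cdot x_0 + x_1 \cdot x_1, \\
  \Delta &= \det \begin{pmatrix} x_0 & 0 \\ 0 & x_1 \end{pmatrix} = x_0 x_1 . \\
  \intertext{Step 1: $\{x_0^2, x_1^2\}$ is already a Gr\"obner basis.
             Step 2: for $H = \alpha x_0^2 + \beta x_0 x_1 + \gamma x_1^2$,}
  N(H) &= \beta\, x_0 x_1, \qquad N(\Delta) = x_0 x_1 . \\
  \intertext{Step 3:}
  \mathrm{res}^{\mathbb{P}^1}_{\langle F\rangle}(H) &= \frac{N(H)}{N(\Delta)} = \beta .
\end{align*}
```

In this instance the quotient in step 3 is a constant because the degree $\rho$ part of the quotient ring is spanned by the single monomial $x_0 x_1$.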


Remark 1.5.18. There is a straightforward variant of this algorithm valid for
weighted homogeneous polynomials. This more general algorithm has been
used by Batyrev and Materov [BM02] to compute the Yukawa 3-point function
of the generic hypersurface in the weighted projective space $\mathbb{P}^4_w$, $w = (1,1,2,2,2)$.
This function, originally computed in [CdlOF+94], has a series expansion
whose coefficients have enumerative meaning. We refer to [BM02, 10.3] and
[CK99, 5.6.2.1] for more details.


We can combine Theorem 1.5.17 and Algorithm 2 to compute the global 
(affine) residue with respect to a zero-dimensional complete intersection ideal 
with no zeros at infinity in projective space. The construction below is a special 
case of a much more general algorithm described in [CD97] and it applies, in 
particular, to the weighted case as well. It also holds over any algebraically 
closed field K of characteristic zero. 

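Let $\langle P\rangle = \{P_1,\dots,P_n\} \subset K[x_1,\dots,x_n]$ be polynomials satisfying (1.45). Let
$d_i = \deg(P_i)$ and denote by

$$F_i(x_0,x_1,\dots,x_n) = x_0^{d_i}\, P_i\!\left(\frac{x_1}{x_0},\dots,\frac{x_n}{x_0}\right)$$

the homogenization of $P_i$. Let $h(x_1,\dots,x_n) \in K[x_1,\dots,x_n]$. If $d = \deg(h) <
\sum_i (d_i - 1)$, then $\mathrm{res}_{\langle P\rangle}(h) = 0$ by the Euler-Jacobi theorem. Suppose, then,
that $d \ge \sum_i (d_i - 1)$, let $H \in K[x_0,\dots,x_n]$ be its homogenization, and let

$$F_0 = x_0^{d_0}; \qquad d_0 = d - \sum_{i=1}^{n}(d_i - 1) + 1.$$

Then, $d = \sum_{i=0}^{n}(\deg(F_i) - 1)$ and it follows from Theorem 1.5.17 that

$$\mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(H) = \mathrm{res}_{\langle P^{\hat 0}\rangle}(h/P_0) = \mathrm{res}_{\langle P\rangle}(h).$$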


1.5.7 Residues and elimination 


One of the basic applications of residues is to elimination theory. The key idea 
is very simple (see also Section 3.3.1 in Chapter 3). Let $I = \langle P_1,\dots,P_n\rangle \subset
K[x_1,\dots,x_n]$ be a zero-dimensional, complete intersection ideal. Let $\xi_i =
(\xi_{i1},\dots,\xi_{in}) \in K^n$, $i = 1,\dots,r$, be the zeros of $I$. Let $\mu_1,\dots,\mu_r$ denote their
respective multiplicities. Then the power sum

$$S_j^{(k)} = \sum_{i=1}^{r} \mu_i\, \xi_{ij}^{k}$$

is the trace of the multiplication map $M_{x_j^k}\colon A \to A$ and, therefore, it may be
expressed as a global residue:

$$S_j^{(k)} = \mathrm{tr}(M_{x_j^k}) = \mathrm{res}_{\langle P\rangle}\big(x_j^k \cdot J_{\langle P\rangle}(x)\big).$$

The univariate Newton identities of Section 1.2.5 now allow us to compute
inductively the coefficients of a polynomial in the variable $x_j$ with roots at
$\xi_{1j},\dots,\xi_{rj} \in K$ and respective multiplicities $\mu_1,\dots,\mu_r$.

We illustrate the method with the following example. Let

$$I = \langle x_1^3 + x_1^2 - x_2,\ x_1^3 - x_2^2 + x_1x_2\rangle.$$

It is easy to check that the given polynomials are a Gröbner basis for any term
order that refines the weight order defined by $w = (5,9)$. The leading terms are
$x_1^3, -x_2^2$. A normal form computation following Algorithm 1 in Section 1.5.4
yields:

$$S_1^{(1)} = -2;\quad S_1^{(2)} = 4;\quad S_1^{(3)} = -2;\quad S_1^{(4)} = 0;\quad S_1^{(5)} = 8;\quad S_1^{(6)} = -20.$$

For example, the following Singular [GPS01] computation shows how the values
$S_1^{(3)}$ and $S_1^{(4)}$ were obtained:


> ring R = 0, (x1,x2), wp(5,9); 

> ideal I = x1^3 + x1^2 - x2, x1^3 - x2^2 + x1*x2;
> poly J = -6*x1^2*x2+3*x1^3-4*x1*x2+5*x1^2+x2;

> reduce(x1^3*J,std(I));
2*x1^2*x2+2*x1*x2+10*x1^2-10*x2

> reduce(x1^4*J,std(I));
-8*x1*x2-12*x1^2+12*x2


Now, using the Newton identities (1.16) we may compute the coefficients of a
monic polynomial of degree 6 in the variable $x_1$ lying in the ideal:

$$a_5 = 2;\quad a_4 = 0;\quad a_3 = -2;\quad a_2 = 0;\quad a_1 = 0;\quad a_0 = 0.$$

Hence, $f_1(x_1) = x_1^6 + 2x_1^5 - 2x_1^3 \in I$.
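
The passage from the six power sums to the six coefficients can be checked mechanically. The sketch below uses the standard Newton recursion $k\,e_k = \sum_{i=1}^{k} (-1)^{i-1} e_{k-i}\, s_i$ for the elementary symmetric functions; the function name `monic_from_power_sums` is a hypothetical helper introduced for illustration, and the recursion is assumed to agree with (1.16) up to notation.

```python
from fractions import Fraction


def monic_from_power_sums(power_sums):
    """Return [c_1, ..., c_m] with x^m + c_1 x^(m-1) + ... + c_m having the
    given power sums s_1, ..., s_m of its roots (counted with multiplicity)."""
    m = len(power_sums)
    e = [Fraction(1)]  # e[0] = 1, e[k] = k-th elementary symmetric function
    for k in range(1, m + 1):
        total = Fraction(0)
        for i in range(1, k + 1):
            total += (-1) ** (i - 1) * e[k - i] * power_sums[i - 1]
        e.append(total / k)
    # The coefficient of x^(m-k) in prod (x - root) is (-1)^k e_k.
    return [(-1) ** k * e[k] for k in range(1, m + 1)]


# Power sums of the x_1-coordinates from the example in the text.
power_sums = [-2, 4, -2, 0, 8, -20]
coeffs = monic_from_power_sums(power_sums)
print([int(c) for c in coeffs])
```

The resulting list is read as $(a_5, a_4, \dots, a_0)$. The same polynomial is obtained by substituting $x_2 = x_1^3 + x_1^2$ from the first generator into the second one.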

We refer the reader to [AY83, BKL98] for a fuller account of this elimination
procedure. Note also that in Section 3.6 of Chapter 3 there is an application
of residues to the implicitization problem.


1.6 Multivariate resultants 


In this section we will extend the notion of the resultant to multivariate sys- 
tems. We will begin by defining the resultant of n+1 homogeneous polynomials 
in n+ 1 variables and discussing some formulas to compute it. We will also 
discuss some special examples of the so-called sparse or toric resultant. 
1.6.1 Homogeneous resultants


When trying to generalize resultants associated to polynomials in any number
of variables, the first problem one faces is which families of polynomials
one is going to study, i.e. which will be the variables of the resultant. For
example, in the univariate case, fixing the degrees $d_1, d_2$ amounts to setting
$(a_0,\dots,a_{d_1},b_0,\dots,b_{d_2})$ as the input variables for the resultant $\mathrm{Res}_{d_1,d_2}$. One
obvious, and classical, choice in the multivariable case is again to fix the
degrees $d_0,\dots,d_n$ of $n+1$ polynomials in $n$ variables, which will generally
define an overdetermined system. If one wants the vanishing of the resultant
$\mathrm{Res}_{d_0,\dots,d_n}$ to be equivalent to the existence of a common root, one realizes
that a compactification of affine space naturally comes into the picture, in
this case projective $n$-space.
Consider, for instance, a bivariate linear system

$$
\begin{aligned}
f_0(x,y) &= a_{00}x + a_{01}y + a_{02}\\
f_1(x,y) &= a_{10}x + a_{11}y + a_{12} \qquad (1.51)\\
f_2(x,y) &= a_{20}x + a_{21}y + a_{22}
\end{aligned}
$$

We fix the three degrees equal to 1, i.e. we have nine variables $a_{ij}$ ($i,j =
0,1,2$), and we look for an irreducible polynomial $\mathrm{Res}_{1,1,1} \in \mathbb{Z}[a_{ij},\ i,j =
0,1,2]$ which vanishes if and only if the system has a solution $(x,y)$. If such
a solution $(x,y)$ exists, then $(x,y,1)$ would be a non-trivial solution of the
augmented $3\times 3$ linear system and consequently the determinant of the
matrix $(a_{ij})$ must vanish. However, as the following example easily shows, the
vanishing of the determinant does not imply that (1.51) has a solution. Let

$$
\begin{aligned}
f_0(x,y) &= x + 2y + 1\\
f_1(x,y) &= x + 2y + 2\\
f_2(x,y) &= x + 2y + 3
\end{aligned}
$$


The determinant vanishes but the system is incompatible in $\mathbb{C}^2$. On the other
hand, the lines defined by $f_i(x,y) = 0$ are parallel and therefore we may view
them as having a common point at infinity in projective space. We can make
this precise by passing to the homogenized system

$$
\begin{aligned}
F_0(x,y,z) &= x + 2y + z\\
F_1(x,y,z) &= x + 2y + 2z\\
F_2(x,y,z) &= x + 2y + 3z,
\end{aligned}
$$

which has non-zero solutions of the form $(-2y, y, 0)$, i.e. the homogenized
system has a solution in the projective plane $\mathbb{P}^2(\mathbb{C})$, a compactification of the
affine plane $\mathbb{C}^2$.
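
The three claims of this example (vanishing determinant, no affine solution, a common zero at infinity) are easy to check numerically. The following is a minimal sketch; the helper names `det3` and `homogenized` are hypothetical and introduced only for this check.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def homogenized(row, point):
    """Evaluate F(x, y, z) = row[0]*x + row[1]*y + row[2]*z at a point."""
    return sum(coef * coord for coef, coord in zip(row, point))


# Rows are the coefficients of f_0, f_1, f_2 from the example.
coeffs = [(1, 2, 1), (1, 2, 2), (1, 2, 3)]

determinant = det3(coeffs)

# f_0 - f_1 is the non-zero constant -1, so f_0 = f_1 = 0 has no affine solution.
affine_gap = coeffs[0][2] - coeffs[1][2]

# The point (-2 : 1 : 0) lies on the line at infinity z = 0.
point_at_infinity = (-2, 1, 0)
values_at_infinity = [homogenized(row, point_at_infinity) for row in coeffs]

print(determinant, affine_gap, values_at_infinity)
```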

We denote $x = (x_0,\dots,x_n)$ and for any $\alpha = (\alpha_0,\dots,\alpha_n) \in \mathbb{N}^{n+1}$, $|\alpha| =
\alpha_0 + \cdots + \alpha_n$, $x^\alpha = x_0^{\alpha_0} \cdots x_n^{\alpha_n}$. Recall that $f = \sum_\alpha a_\alpha x^\alpha \in k[x_0,\dots,x_n]$ is
called homogeneous (of degree $\deg(f) = d$) if $|\alpha| = d$ for all $\alpha$ with $a_\alpha \neq 0$, or
equivalently, if for all $\lambda \in k$, it holds that $f(\lambda x) = \lambda^d f(x)$, for all $x \in k^{n+1}$.
As we already remarked in Section 1.3.1, the variety of zeros of a homogeneous
polynomial is well defined over $\mathbb{P}^n(k) = (k^{n+1}\setminus\{0\})/\sim$, where we identify
$x \sim \lambda x$, for all $\lambda \in k\setminus\{0\}$. As before, $K$ denotes the algebraic closure of $k$.
The following result is classical.


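Theorem 1.6.1. Fix $d_0,\dots,d_n \in \mathbb{N}$ and write $F_i = \sum_{|\alpha| = d_i} a_{i\alpha} x^\alpha$, $i =
0,\dots,n$. There exists a unique irreducible polynomial

$$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) \in \mathbb{Z}[a_{i\alpha};\ i = 0,\dots,n,\ |\alpha| = d_i]$$

which verifies:

(i) $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) = 0$ for a given specialization of the coefficients in
$k$ if and only if there exists $x \in \mathbb{P}^n(K)$ such that $F_0(x) = \cdots = F_n(x) = 0$.
(ii) $\mathrm{Res}_{d_0,\dots,d_n}(x_0^{d_0},\dots,x_n^{d_n}) = 1$.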


The resultant $\mathrm{Res}_{d_0,\dots,d_n}$ depends on $N$ variables, where $N = \sum_{i=0}^{n} \binom{n+d_i}{d_i}$.
A geometric proof of this theorem, which is widely generalizable, can be found
for instance in [Stu98]. It is based on the consideration of the incidence variety

$$Z = \Big\{\big((a_{i\alpha}), x\big) \in K^N \times \mathbb{P}^n(K) \ :\ \sum_{|\alpha| = d_i} a_{i\alpha} x^\alpha = 0,\ i = 0,\dots,n\Big\},$$

and its two projections to $K^N$ and $\mathbb{P}^n(K)$. In fact, $Z$ is an irreducible variety
of dimension $N - 1$ and the first projection is generically one-to-one
onto its image.

As we noticed above, in the linear case $d_0 = \cdots = d_n = 1$, the resultant
is the determinant of the linear system. We now state the main properties of
multivariate homogeneous resultants, which generalize the properties of
determinants and of the univariate resultant (or bivariate homogeneous resultant)
in Section 1.3.2. The proofs require more background, and we will omit them.


Main properties 


i) The resultant $\mathrm{Res}_{d_0,\dots,d_n}$ is homogeneous in the coefficients of $F_i$ of degree
$d_0 \cdots d_{i-1} d_{i+1} \cdots d_n$, i.e. by Bézout's theorem, the number of generic
common roots of $F_0 = \cdots = F_{i-1} = F_{i+1} = \cdots = F_n = 0$.

ii) The resultants $\mathrm{Res}_{d_0,\dots,d_i,\dots,d_j,\dots,d_n}$ and $\mathrm{Res}_{d_0,\dots,d_j,\dots,d_i,\dots,d_n}$ coincide up to
sign.

iii) For any monomial $x^\gamma$ of degree $|\gamma|$ greater than the critical degree $\rho :=
\sum_{i=0}^{n}(d_i - 1)$, there exist homogeneous polynomials $A_0,\dots,A_n$ in the variables
$x_0,\dots,x_n$ with coefficients in $\mathbb{Z}[(a_{i\alpha})]$ and $\deg(A_i) = |\gamma| - d_i$, such
that

$$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) \cdot x^\gamma = A_0F_0 + \cdots + A_nF_n. \qquad (1.52)$$
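
The number $N$ of coefficients and the degrees in property i) are simple functions of $(d_0,\dots,d_n)$. The sketch below tabulates them; the function names are hypothetical, and the count of monomials of degree $d$ in $n+1$ variables is taken to be $\binom{n+d}{n}$.

```python
from math import comb, prod


def num_coefficients(degrees):
    """N = total number of coefficients of generic forms of the given degrees."""
    n = len(degrees) - 1
    return sum(comb(n + d, n) for d in degrees)


def degree_in_coefficients_of(degrees, i):
    """Degree of the resultant in the coefficients of F_i (property i)."""
    return prod(d for j, d in enumerate(degrees) if j != i)


def total_degree(degrees):
    """Total degree of the resultant in all coefficients."""
    return sum(degree_in_coefficients_of(degrees, i) for i in range(len(degrees)))


for degs in [(1, 1, 1), (1, 1, 2), (2, 2, 2)]:
    print(degs, num_coefficients(degs), total_degree(degs))
```

For $(1,1,1)$ this gives 9 coefficients and total degree 3, as expected for a $3\times 3$ determinant.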
Call $f_i(x_1,\dots,x_n) = F_i(1,x_1,\dots,x_n) \in k[x_1,\dots,x_n]$ the dehomogenizations
of $F_0,\dots,F_n$. One can define the resultant

$$\mathrm{Res}_{d_0,\dots,d_n}(f_0,\dots,f_n) := \mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$$

and try to translate to the affine setting these properties of the homogeneous
resultant. We point out the following direct consequence of (1.52). Taking
$\gamma = (\rho+1,0,\dots,0)$ and then specializing $x_0 = 1$, we deduce that there
exist polynomials $A_0,\dots,A_n \in \mathbb{Z}[(a_{i\alpha})][x_1,\dots,x_n]$, with $\deg(A_i)$ bounded by
$\rho + 1 - d_i = \sum_{j\neq i} d_j - n$, and such that

$$\mathrm{Res}_{d_0,\dots,d_n}(f_0,\dots,f_n) = A_0f_0 + \cdots + A_nf_n. \qquad (1.53)$$

As we remarked in the linear case, the resultant $\mathrm{Res}_{d_0,\dots,d_n}(f_0,\dots,f_n)$
can vanish even if $f_0,\dots,f_n$ do not have any common root in $K^n$ if their
homogenizations $F_0,\dots,F_n$ have a non-zero common root with $x_0 = 0$. Denote
by $f_{i,d_i} = F_i(0,x_1,\dots,x_n)$ the homogeneous component of top degree of each
$f_i$. The corresponding version of Proposition 1.3.2 is as follows.


Proposition 1.6.2. (Homogeneous Poisson formula) Let $F_0,\dots,F_n$ be
homogeneous polynomials with degrees $d_0,\dots,d_n$ and let $f_i(x_1,\dots,x_n)$ and
$f_{i,d_i}(x_1,\dots,x_n)$ be as above. Then

$$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) = \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})^{d_0} \prod_{\xi\in V} f_0(\xi)^{m_\xi},$$

where $V$ is the common zero set of $f_1,\dots,f_n$, and $m_\xi$ denotes the multiplicity
of $\xi \in V$.


This factorization holds in the field of rational functions over the
coefficients $(a_{i\alpha})$. Stated differently, the product $\prod_{\xi\in V} f_0(\xi)^{m_\xi}$ is a rational
function of the coefficients, whose numerator is the irreducible polynomial
$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ and whose denominator is the $d_0$ power of the
irreducible polynomial $\mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})$, which only depends on the
coefficients of the monomials of highest degree $d_1,\dots,d_n$ of $f_1,\dots,f_n$. Note
that taking $F_0 = x_0$ we get, in particular, the expected formula

$$\mathrm{Res}_{1,d_1,\dots,d_n}(x_0,F_1,\dots,F_n) = \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}). \qquad (1.54)$$


Another direct consequence of Proposition 1.6.2 is the multiplicative property:

$$
\begin{aligned}
\mathrm{Res}_{d_0'+d_0'',d_1,\dots,d_n}(F_0' \cdot F_0'', F_1,\dots,F_n) = {}& \mathrm{Res}_{d_0',d_1,\dots,d_n}(F_0',F_1,\dots,F_n) \\
 &\cdot \mathrm{Res}_{d_0'',d_1,\dots,d_n}(F_0'',F_1,\dots,F_n), \qquad (1.55)
\end{aligned}
$$

where $F_0', F_0''$ are homogeneous polynomials of respective degrees $d_0', d_0''$. More
details and applications of the homogeneous resultant to study $V$ and the
quotient ring by the ideal $\langle f_1,\dots,f_n\rangle$ can be found in Chapter 2, Section 2.3.2.
Some words on the computation of homogeneous resultants


When trying to find explicit formulas for multivariate resultants like the
Sylvester or Bézout formulas (1.22), (1.25), one starts searching for maps such as
(1.21) which are an isomorphism if and only if the resultant does not vanish.
But this is possible only in very special cases or low dimensions, and
higher linear algebra techniques are needed, in particular the notion of the
determinant of a complex [GKZ94]. Given $d_0,\dots,d_n$, the first idea to find a
linear map whose determinant equals the resultant $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ is
to consider the map

$$
\begin{aligned}
S_{\rho+1-d_0} \times \cdots \times S_{\rho+1-d_n} &\longrightarrow S_{\rho+1} \qquad (1.56)\\
(G_0,\dots,G_n) &\longmapsto G_0F_0 + \cdots + G_nF_n,
\end{aligned}
$$

where we denote by $S_\ell$ the space of homogeneous polynomials of degree $\ell$ and
we recall that $\rho + 1 = d_0 + \cdots + d_n - n$.

For any specialization in $K$ of the coefficients of $F_0,\dots,F_n$ (with respective
degrees $d_0,\dots,d_n$), we get a $K$-linear map between finite dimensional
$K$-vector spaces which is surjective if and only if $F_0,\dots,F_n$ do not have any
common root in $K^{n+1}\setminus\{0\}$. But it is easy to realize that the dimensions are
not equal, except if $n = 1$ or $d_0 = \cdots = d_n = 1$. Macaulay [Mac02, Mac94]
then proposed a choice of a generically non-zero maximal minor of the
corresponding rectangular matrix in the standard bases of monomials, which
exhibits the multivariate resultant not as a determinant but as a quotient of
two determinants. More details on this can be found in Chapters 2 and 3; see
also [CLO98].
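
The dimension mismatch in the map (1.56) can be made concrete by counting monomials. The sketch below assumes $\dim S_\ell = \binom{n+\ell}{n}$ (and $0$ for $\ell < 0$); the function names are hypothetical.

```python
from math import comb


def dim_S(n, ell):
    """Dimension of the space of forms of degree ell in n + 1 variables."""
    return comb(n + ell, n) if ell >= 0 else 0


def macaulay_map_dims(degrees):
    """Return (dim of source, dim of target) of the map (1.56)."""
    n = len(degrees) - 1
    top = sum(degrees) - n  # this is rho + 1
    source = sum(dim_S(n, top - d) for d in degrees)
    target = dim_S(n, top)
    return source, target


for degs in [(2, 3), (1, 1, 1), (1, 1, 2), (2, 2, 2)]:
    print(degs, macaulay_map_dims(degs))
```

The two dimensions agree for $(2,3)$ (the Sylvester case, $n = 1$) and for $(1,1,1)$ (the linear case), but not for $(1,1,2)$ or $(2,2,2)$, where the source is strictly larger and a maximal minor has to be chosen.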

We now recall the multivariate Bezoutian defined in Section 1.5 (cf. also
Chapter 3).

Let $F_0,\dots,F_n$ be polynomials of degrees $d_0,\dots,d_n$. Write $x = (x_0,\dots,x_n)$,
$y = (y_0,\dots,y_n)$ and let $F_i(x) - F_i(y) = \sum_{j=0}^{n} F_{ij}(x,y)(x_j - y_j)$, where $F_{ij}$ are
homogeneous polynomials in $2(n+1)$ variables of degree $d_i - 1$. The Bezoutian
polynomial $\Delta_{\langle F\rangle}$ is defined as the determinant

$$\Delta_{\langle F\rangle}(x,y) = \det\big((F_{ij}(x,y))\big) = \sum_{|\alpha|\le\rho} \Delta_\alpha(x)\, y^\alpha.$$

For instance, we can take as in (1.32)

$$F_{ij}(x,y) = \big(F_i(y_0,\dots,y_{j-1},x_j,\dots,x_n) - F_i(y_0,\dots,y_j,x_{j+1},\dots,x_n)\big)/(x_j - y_j).$$

This polynomial is well defined modulo $\langle F_0(x) - F_0(y),\dots,F_n(x) - F_n(y)\rangle$.
Note that the sum of the degrees $\deg(\Delta_\alpha) + |\alpha|$ equals the critical degree
$\rho = \sum_i (d_i - 1)$. In fact, for any specialization of the coefficients in $k$ such that
$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ is non-zero, the specialized polynomials $\{\Delta_\alpha,\ |\alpha| = m\}$
give a system of generators (over $k$) of the classes of homogeneous polynomials
of degree $\rho - m$ in the quotient $k[x_0,\dots,x_n]/\langle F_0(x),\dots,F_n(x)\rangle$, for any $m \le \rho$.
In particular, according to Theorem 1.5.16, the graded piece of degree $\rho$ of
the quotient has dimension one and a basis is given by the coefficient

$$\Delta_0(x) = \Delta_{\langle F\rangle}(x,0). \qquad (1.57)$$

On the other side, by (1.52), any homogeneous polynomial of degree at least
$\rho + 1$ lies in the ideal $\langle F_0(x),\dots,F_n(x)\rangle$.


There is a determinantal formula for the resultant $\mathrm{Res}_{d_0,\dots,d_n}$ (as the
determinant of a matrix involving coefficients of the given polynomials and
coefficients of their Bezoutian $\Delta_{\langle F\rangle}$) only when $d_2 + \cdots + d_n < d_0 + d_1 + n$.
In general, it is possible to find smaller Macaulay formulas than those arising
from (1.56), as the quotient of the determinants of two such explicit matrices
(cf. [Jou97], [WZ94], [DD01]).

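Assume for example that $n = 2$, $(d_0,d_1,d_2) = (1,1,2)$, and let

$$
\begin{aligned}
F_0 &= a_0x_0 + a_1x_1 + a_2x_2\\
F_1 &= b_0x_0 + b_1x_1 + b_2x_2\\
F_2 &= c_1x_0^2 + c_2x_1^2 + c_3x_2^2 + c_4x_0x_1 + c_5x_0x_2 + c_6x_1x_2
\end{aligned}
$$

be generic polynomials of respective degrees 1, 1, 2. Macaulay's classical matrix
looks as follows:

$$
\begin{pmatrix}
a_0 & 0 & 0 & 0 & 0 & c_1\\
0 & a_1 & 0 & b_1 & 0 & c_2\\
0 & 0 & a_2 & 0 & b_2 & c_3\\
a_1 & a_0 & 0 & b_0 & 0 & c_4\\
a_2 & 0 & a_0 & 0 & b_0 & c_5\\
0 & a_2 & a_1 & b_2 & b_1 & c_6
\end{pmatrix}
$$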


and its determinant equals $-a_0\,\mathrm{Res}_{1,1,2}$. In this case, the extraneous factor $a_0$
is the $1\times 1$ minor formed by the element in the fourth row, second column. On
the other hand, we can exhibit a determinantal formula for $\pm\mathrm{Res}_{1,1,2}$, given
by the determinant of

$$
\begin{pmatrix}
\Delta_{(1,0,0)} & a_0 & b_0\\
\Delta_{(0,1,0)} & a_1 & b_1\\
\Delta_{(0,0,1)} & a_2 & b_2
\end{pmatrix},
$$

where the coefficients $\Delta_\gamma$ of the Bezoutian $\Delta_{\langle F\rangle}$ are given by

$$\Delta_{(1,0,0)} = c_1(a_1b_2 - a_2b_1) - c_4(a_0b_2 - a_2b_0) + c_5(a_0b_1 - a_1b_0),$$

$$\Delta_{(0,1,0)} = c_6(a_0b_1 - a_1b_0) - c_2(a_0b_2 - b_0a_2)$$

and

$$\Delta_{(0,0,1)} = c_3(a_0b_1 - b_0a_1).$$


In fact, in this case the resultant can be also computed as follows. The
generic space of solutions of the linear system $F_0 = F_1 = 0$ is generated by the
vector of minors $(a_1b_2 - a_2b_1,\ -(a_0b_2 - a_2b_0),\ a_0b_1 - a_1b_0)$. Then

$$\mathrm{Res}_{1,1,2}(F_0,F_1,F_2) = F_2\big(a_1b_2 - a_2b_1,\ -(a_0b_2 - a_2b_0),\ a_0b_1 - a_1b_0\big).$$
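
The three descriptions of $\mathrm{Res}_{1,1,2}$ (evaluation of $F_2$ at the vector of minors, Macaulay's $6\times 6$ matrix, and the $3\times 3$ matrix built from the Bezoutian coefficients) can be compared on an integer specialization. This is a minimal sketch: all function names are hypothetical, the numerical coefficients are arbitrary, and the determinant routine is a plain Laplace expansion chosen for brevity rather than efficiency.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def minors(a, b):
    """Signed 2x2 minors spanning the solutions of F0 = F1 = 0."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    return (a1 * b2 - a2 * b1, -(a0 * b2 - a2 * b0), a0 * b1 - a1 * b0)


def res_112(a, b, c):
    """Evaluate the quadric F2 at the vector of minors."""
    c1, c2, c3, c4, c5, c6 = c
    x0, x1, x2 = minors(a, b)
    return (c1 * x0 * x0 + c2 * x1 * x1 + c3 * x2 * x2
            + c4 * x0 * x1 + c5 * x0 * x2 + c6 * x1 * x2)


def macaulay_matrix(a, b, c):
    """Rows: x0^2, x1^2, x2^2, x0x1, x0x2, x1x2; columns: x0F0, x1F0, x2F0, x1F1, x2F1, F2."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    c1, c2, c3, c4, c5, c6 = c
    return [
        [a0, 0, 0, 0, 0, c1],
        [0, a1, 0, b1, 0, c2],
        [0, 0, a2, 0, b2, c3],
        [a1, a0, 0, b0, 0, c4],
        [a2, 0, a0, 0, b0, c5],
        [0, a2, a1, b2, b1, c6],
    ]


def bezoutian_matrix(a, b, c):
    """3x3 matrix whose first column holds the Bezoutian coefficients."""
    c1, c2, c3, c4, c5, c6 = c
    m0, m1, m2 = minors(a, b)
    d100 = c1 * m0 + c4 * m1 + c5 * m2
    d010 = c6 * m2 + c2 * m1
    d001 = c3 * m2
    return [[d100, a[0], b[0]], [d010, a[1], b[1]], [d001, a[2], b[2]]]


# Arbitrary integer specialization of the coefficients.
a, b, c = (2, 3, 5), (7, 11, 13), (1, -2, 3, 4, -5, 6)
res = res_112(a, b, c)
print(res, det(bezoutian_matrix(a, b, c)), det(macaulay_matrix(a, b, c)))
```

With the sign conventions used in this sketch, the Bezoutian determinant reproduces the value of $F_2$ at the minors, and the Macaulay determinant differs from it by the factor $-a_0$.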


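Suppose now that $F_0 = \sum_{i=0}^{n} a_i x_i$ is a linear form. As in expression (1.54)
one gets, using the homogeneity of the resultant, that

$$
\begin{aligned}
\mathrm{Res}_{1,d_1,\dots,d_n}(F_0,F_1,\dots,F_n)
 &= a_0^{d_1\cdots d_n}\, \mathrm{Res}_{1,d_1,\dots,d_n}\Big(x_0 + \sum_{i=1}^{n} \frac{a_i}{a_0} x_i,\ F_1,\dots,F_n\Big)\\
 &= a_0^{d_1\cdots d_n}\, \mathrm{Res}_{d_1,\dots,d_n}\Big(F_1\big(-\sum_{i=1}^{n} \tfrac{a_i}{a_0} x_i, x_1,\dots,x_n\big),\dots,F_n\big(-\sum_{i=1}^{n} \tfrac{a_i}{a_0} x_i, x_1,\dots,x_n\big)\Big).
\end{aligned}
$$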
More generally, let $\ell_0,\dots,\ell_{r-1}$ be generic linear forms and $F_r,\dots,F_n$ be
homogeneous polynomials of degree $d_r,\dots,d_n$ in the variables $x_0,\dots,x_n$.
Write $\ell_i = \sum_{j=0}^{n} a^i_j x_j$ and for any subset $J$ of $\{0,\dots,n\}$, $|J| = r$, denote by
$\delta_J$ the determinant of the square submatrix $A_J := (a^i_j)$, $j \in J$. Obviously,
$\delta_J \in \mathbb{Z}[a^i_j,\ j \in J]$ vanishes if and only if $\ell_0 = \cdots = \ell_{r-1} = 0$ cannot be
parametrized by the variables $(x_j)_{j\notin J}$.

Assume for simplicity that $J = \{0,\dots,r-1\}$ and let $\delta_J \neq 0$. Left multiplying
by the inverse matrix of $A_J$, the equality $A\,x^t = 0$ is equivalent
to $x_k = k$-th coordinate of $-(A_J)^{-1}\cdot(a^i_j)_{j\notin J}\,(x_r,\dots,x_n)^t$, for all $k \in J$.
Call $F^J_j(x_r,\dots,x_n)$, $j = r,\dots,n$, the homogeneous polynomials of degrees
$d_r,\dots,d_n$ respectively gotten from $F_j$, $j = r,\dots,n$ after this substitution.
Using standard properties of Chow forms (defined below), we then have


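Proposition 1.6.3. Up to sign,

$$\mathrm{Res}_{1,\dots,1,d_r,\dots,d_n}(\ell_0,\dots,\ell_{r-1},F_r,\dots,F_n) = \delta_J^{\,d_r\cdots d_n}\ \mathrm{Res}_{d_r,\dots,d_n}(F^J_r,\dots,F^J_n).$$

In case $r = n$ we moreover have

$$\mathrm{Res}_{1,\dots,1,d_n}(\ell_0,\dots,\ell_{n-1},F_n) = F_n\big(\delta_{\{1,\dots,n\}},\ -\delta_{\{0,2,\dots,n\}},\ \dots,\ (-1)^n\delta_{\{0,\dots,n-1\}}\big).$$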


As we have already remarked in the univariate case, resultants can, in
principle, be obtained by a Gröbner basis computation using an elimination
order. However, this is often not feasible in practice, while using geometric
information contained in the system of equations to build the resultant
matrices may make it possible to obtain the result. These matrices may easily
become huge (cf. [DD01] for instance), but they are structured. For some
recent implementations of resultant computations in Macaulay2 and Maple,
together with examples and applications, we also refer to [Bus03].


The unmixed case 


Assume we have an unmixed system, i.e. all degrees are equal. Call $d_0 = \cdots =
d_n = d$ and write $F_i(x) = \sum_{|\gamma|=d} a_{i\gamma} x^\gamma$. Then, the coefficients of each $\Delta_\alpha$ are
linear combinations with integer coefficients of the brackets $[\gamma_0,\dots,\gamma_n] :=
\det(a_{i\gamma_j},\ i,j = 0,\dots,n)$, for any subset $\{\gamma_0,\dots,\gamma_n\}$ of multi-indices of degree
$d$. In fact, in this equal-degree case, if $F_0,\dots,F_n$ and $G_0,\dots,G_n$ are
homogeneous polynomials of degree $d$, and $G_i = \sum_{j=0}^{n} m_{ij} F_j$, $i = 0,\dots,n$, where
$M = (m_{ij}) \in k^{(n+1)\times(n+1)}$, then

$$\mathrm{Res}_{d,\dots,d}(G_0,\dots,G_n) = \det(M)^{d^n}\ \mathrm{Res}_{d,\dots,d}(F_0,\dots,F_n).$$

Thus, $\mathrm{Res}_{d,\dots,d}$ is invariant under the action of the group
$SL(n+1,k)$ of matrices with determinant 1, and by the Fundamental Theorem
of Invariant Theory, there exists a (non-unique) polynomial $P$ in the brackets
such that $\mathrm{Res}_{d,\dots,d}(F_0,\dots,F_n) = P([\gamma_0,\dots,\gamma_n],\ |\gamma_i| = d)$. There exists a
determinantal formula in terms of the coefficients of the Bezoutian as in (1.24)
only if $n = 1$ or $d = 1$. In the "simple" case $n = 2$, $d = 2$, $\mathrm{Res}_{2,2,2}$ is a degree
12 polynomial with more than 20,000 terms in the 18 coefficients of $F_0, F_1, F_2$,
while it has degree 4 in the 20 brackets with considerably fewer terms.
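
The transformation rule under $G_i = \sum_j m_{ij}F_j$ can be checked in the smallest unmixed case $n = 1$, $d = 2$, where the exponent $d^n$ equals 2. The sketch below uses the classical closed form for the resultant of two binary quadratics; the function names and the numerical data are assumptions made for illustration.

```python
def res_quadratics(p, q):
    """Resultant of p0*x^2 + p1*x*y + p2*y^2 and q0*x^2 + q1*x*y + q2*y^2."""
    p0, p1, p2 = p
    q0, q1, q2 = q
    return (p0 * q2 - p2 * q0) ** 2 - (p0 * q1 - p1 * q0) * (p1 * q2 - p2 * q1)


def combine(m, f0, f1):
    """Return (G0, G1) with G_i = m[i][0]*F0 + m[i][1]*F1, coefficientwise."""
    g0 = tuple(m[0][0] * u + m[0][1] * v for u, v in zip(f0, f1))
    g1 = tuple(m[1][0] * u + m[1][1] * v for u, v in zip(f0, f1))
    return g0, g1


f0, f1 = (1, 2, 3), (4, 5, 7)      # two binary quadratics
m = [[2, 1], [3, 5]]               # det(m) = 7
det_m = m[0][0] * m[1][1] - m[0][1] * m[1][0]

g0, g1 = combine(m, f0, f1)
lhs = res_quadratics(g0, g1)
rhs = det_m ** 2 * res_quadratics(f0, f1)
print(lhs, rhs)
```

The closed form is quadratic in the three brackets $p_iq_j - p_jq_i$, and each bracket is multiplied by $\det(M)$ under the substitution, which accounts for the exponent 2.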
Given a projective variety $X \subset \mathbb{P}^N(K)$, of dimension $n$, and $n$ generic
linear forms $\ell_1,\dots,\ell_n$, the intersection $X \cap (\ell_1 = 0) \cap \cdots \cap (\ell_n = 0)$ is finite of
cardinality equal to the degree of the variety $\deg(X)$. If we take instead $(n+1)$
generic linear forms $\ell_0,\dots,\ell_n$ in $\mathbb{P}^N(K)$, the intersection $X_\ell := X \cap (\ell_0 = 0) \cap
\cdots \cap (\ell_n = 0)$ is empty. The Chow form $C_X$ of $X$ is an irreducible polynomial
in the coefficients of $\ell_0,\dots,\ell_n$ verifying

$$C_X(\ell_0,\dots,\ell_n) = 0 \iff X_\ell \neq \emptyset.$$


Consider for example the twisted cubic, i.e. the curve $V$ defined as the
closure in $\mathbb{P}^3(K)$ of the points parametrized by $(1 : t : t^2 : t^3)$, $t \in K$. It can
also be presented as

$$V = \{(\xi_0 : \xi_1 : \xi_2 : \xi_3) \in \mathbb{P}^3(K) \ :\ \xi_1^2 - \xi_0\xi_2 = \xi_2^2 - \xi_1\xi_3 = \xi_0\xi_3 - \xi_1\xi_2 = 0\}.$$

Given a linear form $\ell_0 = a_0\xi_0 + a_1\xi_1 + a_2\xi_2 + a_3\xi_3$ (resp. $\ell_1 = b_0\xi_0 + b_1\xi_1 +
b_2\xi_2 + b_3\xi_3$), a point in $V$ of the form $(1 : t : t^2 : t^3)$ is annihilated by $\ell_0$ (resp.
$\ell_1$) if and only if $t$ is a root of the cubic polynomial $f_0 = a_0 + a_1t + a_2t^2 + a_3t^3$
(resp. $f_1 = b_0 + b_1t + b_2t^2 + b_3t^3$). It follows that

$$C_V(\ell_0,\ell_1) = \mathrm{Res}_{3,3}(f_0,f_1).$$
In general, given $n$ and $d$, denote $N = \binom{n+d}{d}$ and consider the Veronese
variety $V_{n,d}$ in $\mathbb{P}^{N-1}(K)$ defined as the image of the Veronese map

$$\mathbb{P}^n(K) \longrightarrow \mathbb{P}^{N-1}(K), \qquad (t_0 : \cdots : t_n) \longmapsto (t^\alpha)_{|\alpha|=d}.$$

Given coefficients $(a_{i\alpha},\ i = 0,\dots,n,\ |\alpha| = d)$, denote by $\ell_i = \sum_{|\alpha|=d} a_{i\alpha}\xi_\alpha$
and $f_i = \sum_{|\alpha|=d} a_{i\alpha}t^\alpha$, $i = 0,\dots,n$, the corresponding linear forms in the $N$
variables $\xi_\alpha$ and degree $d$ polynomials in the $n+1$ variables $t_i$. Then,

$$C_{V_{n,d}}(\ell_0,\dots,\ell_n) = \mathrm{Res}_{d,\dots,d}(f_0,\dots,f_n).$$

For the use of exterior algebra methods to compute Chow forms, and a fortiori
unmixed resultants, we refer to [ESW03].
1.6.2 A glimpse of other multivariate resultants


Resultants behave quite badly with respect to specializations or give no in- 
formation, and so different notions of resultants tailored for special families 
of polynomials are needed, together with appropriate different algebraic com- 
pactifications. 

Suppose we want to define a resultant which describes the existence of a
common root of three degree 2 polynomials of the form

$$f_i(x_1,x_2) = a_i x_1x_2 + b_i x_1 + c_i x_2 + d_i; \quad a_i,b_i,c_i,d_i \in K,\ i = 0,1,2, \qquad (1.58)$$

i.e. ranging in the subvariety of the degree 2 polynomials with zero coefficients
in the monomials $x_1^2, x_2^2$. Note that the homogenized polynomials

$$F_i(x_0,x_1,x_2) = a_i x_1x_2 + b_i x_0x_1 + c_i x_0x_2 + d_i x_0^2, \quad i = 0,1,2,$$

vanish at $(0,1,0)$ and $(0,0,1)$ for any choice of coefficients $a_i,b_i,c_i,d_i$. Therefore
the homogeneous resultant $\mathrm{Res}_{2,2,2}(f_0,f_1,f_2)$ is meaningless because it
is identically zero. Nevertheless, the closure in the 12 dimensional parameter
space $K^{12}$ with coordinates $(a_0,\dots,d_2)$ of the vectors of coefficients for which
$f_0,f_1,f_2$ have a common root in $K^2$ is an irreducible hypersurface, whose
equation is the following polynomial with 66 terms:


Resci,1),(1,1),(1,1) (fo, fi, fa) = —caodjazbo — aiczbodi — aicobzdy + agcidobi 


42agcibscebido — arcabocobide — ancib3do + c2agdzb2 — cza0bjdo + a1cadoaobid2 
+cazd7 bo + 2c9a2b1c1 bod2 — 2czagd b2a,do + a2c;bobedo + a,codoagbod1 + az codobs 
+a2c1doaobed1 — a3c1dobodi +.42¢1doa1bod2 — azc1deb2a1 +c9a2d1b2a1do — a1c2dgb1a2 
+c2a9d1b,a2do9 + c2a9d1a1bod2 — aicodzagb1 —c9a2b1 boced, — azc1 bo becod, — chazbi de 
—aic2boc1bedo + c.a0b1 bod: +.a1c2bec1d2 — aocib2co9b1d2 + a0c1b3cods — 2a1cod2a2bod1 
+a cod2aqbed) —c9a2d7.agb2—agc1d2b2d, — a} codobod2—2a0c1d2b1.a2d9+cpa2d1a9b1d2 


—coarzdybido + cea2by bed, + a1cabebid2 + aoci dzazgbod1 — aocidsa1bo + azcbecedy 


+ca2b}codo+a1cod2b1 az2do+aoc1d2b2a1do +c2a9b7cod2—c2a0b1becodi —coa2b1c1b2do 

—acob2c1bod2+2a1cob2boc2d1 —a2c1b9c2b1do—a1cob2c2b1 do +a1C9b3C1do+a0cj b2bod2 

aoc bebpc2d1 — c2a9b1.c1bod2 — c2a¢d1b1d2 — af{codzb2do + ajcodsbo + aiczbobido 
—a2ci bade + agcidzbr. (1.59) 


This polynomial is called the multihomogeneous resultant (associated to bide- 
grees (1,1)). In Section 1.7 we will describe a method to compute it. 

There are also determinantal formulas to compute this resultant, i.e. formulas
that present $\mathrm{Res}_{(1,1),(1,1),(1,1)}(f_0,f_1,f_2)$ as the determinant of a matrix
whose entries are coefficients of the given polynomials or of an adequate version
of their Bezoutian. The smallest such formula gives the resultant as the
determinant of a $2\times 2$ matrix, as follows. Given $f_0,f_1,f_2$ as in (1.58), introduce
two new variables $y_1,y_2$ and let $B$ be the matrix:
$$
B = \begin{pmatrix}
f_0(x_1,x_2) & f_1(x_1,x_2) & f_2(x_1,x_2)\\
f_0(y_1,x_2) & f_1(y_1,x_2) & f_2(y_1,x_2)\\
f_0(y_1,y_2) & f_1(y_1,y_2) & f_2(y_1,y_2)
\end{pmatrix}
$$

Compute the Bezoutian polynomial

$$\frac{1}{(x_1 - y_1)(x_2 - y_2)}\,\det(B) = B_{11} + B_{12}x_2 + B_{21}y_1 + B_{22}x_2y_1,$$

where the coefficients $B_{ij}$ are homogeneous polynomials of degree 3 in the
coefficients $(a_0,\dots,d_2)$ with tridegree $(1,1,1)$ with respect to the coefficients
of $f_0$, $f_1$ and $f_2$. Moreover, they are brackets in the coefficient vectors; for
instance, $B_{11} = c_1b_0d_2 - b_0c_2d_1 - c_0b_1d_2 + c_0b_2d_1 + b_1c_2d_0 - c_1b_2d_0$ is the
determinant of the matrix with rows $(b_0,c_0,d_0)$, $(b_1,c_1,d_1)$, $(b_2,c_2,d_2)$. Finally,

$$\mathrm{Res}_{(1,1),(1,1),(1,1)}(f_0,f_1,f_2) = \det(B_{ij}).$$


These formulas go back to the pioneering work of Dixon [Dix08]. For a mod- 
ern account of determinantal formulas for multihomogeneous resultants see 
[DE03]. 
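
Dixon's construction can be tried out numerically without any symbolic algebra. The sketch below relies on the observation that the quotient $\det(B)/((x_1-y_1)(x_2-y_2))$ depends only on $x_2$ and $y_1$ and is linear in each of them, so its four coefficients can be recovered from four evaluations; the fixed values chosen for $x_1$ and $y_2$ are arbitrary, and all function names are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def f(coef, x1, x2):
    """Evaluate a*x1*x2 + b*x1 + c*x2 + d for coef = (a, b, c, d)."""
    a, b, c, d = coef
    return a * x1 * x2 + b * x1 + c * x2 + d


def bezoutian_value(polys, x2, y1, x1=Fraction(5), y2=Fraction(7)):
    """Value of det(B) / ((x1 - y1)(x2 - y2)); x1 and y2 are arbitrary here."""
    rows = [
        [f(p, x1, x2) for p in polys],
        [f(p, y1, x2) for p in polys],
        [f(p, y1, y2) for p in polys],
    ]
    return det3(rows) / ((x1 - y1) * (x2 - y2))


def dixon_matrix(polys):
    """Recover B11 + B12*x2 + B21*y1 + B22*x2*y1 from four evaluations."""
    zero, one = Fraction(0), Fraction(1)
    v00 = bezoutian_value(polys, zero, zero)
    v10 = bezoutian_value(polys, one, zero)   # x2 = 1, y1 = 0
    v01 = bezoutian_value(polys, zero, one)   # x2 = 0, y1 = 1
    v11 = bezoutian_value(polys, one, one)
    return [[v00, v10 - v00], [v01 - v00, v11 - v10 - v01 + v00]]


def dixon_resultant(polys):
    (b11, b12), (b21, b22) = dixon_matrix(polys)
    return b11 * b22 - b12 * b21


# Three polynomials (a, b, c, d) sharing the root (x1, x2) = (2, 3).
with_root = [(1, 2, 3, -19), (2, -1, 1, -13), (0, 1, 4, -14)]
# f0 = x1, f1 = x2, f2 = x1*x2 + 1 have no common root.
without_root = [(0, 1, 0, 0), (0, 0, 1, 0), (1, 0, 0, 1)]

print(dixon_resultant(with_root), dixon_resultant(without_root))
```

For the second system the Bezoutian is $1 - x_2y_1$, so the $2\times 2$ determinant can be confirmed by hand.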

Multihomogeneous resultants are special instances of sparse (or toric)
resultants. We refer to Chapter 7 for the computation and applications of sparse
resultants. The setting is as follows (cf. [GKZ94, Stu93]). We fix $n+1$ finite
subsets $A_0,\dots,A_n$ of $\mathbb{Z}^n$. To each $\alpha \in \mathbb{Z}^n$ we associate the Laurent monomial
$x^\alpha = x_1^{\alpha_1}\cdots x_n^{\alpha_n}$ and consider

$$f_i = \sum_{\alpha\in A_i} a_{i\alpha}\, x^\alpha, \qquad i = 0,\dots,n.$$


For instance, one could fix lattice polytopes $P_0,\dots,P_n$ and take $A_i = P_i \cap \mathbb{Z}^n$.
In general $A_i$ is a subset of the lattice points in its convex hull $P_i$. For generic
choices of the coefficients $a_{i\alpha}$, the polynomials $f_0,\dots,f_n$ have no common
root. We consider then the closure $H_A$ of the set of coefficients for which
$f_0,\dots,f_n$ have a common root in the torus $(K\setminus\{0\})^n$. If $H_A$ is a hypersurface,
it is irreducible, and its defining equation, which has integer coefficients
(defined up to sign by the requirement that its content be 1), is called the sparse
resultant $\mathrm{Res}_{A_0,\dots,A_n}$. The hypersurface condition is fulfilled if the family of
polytopes $P_0,\dots,P_n$ is essential, i.e. if for any proper subset $I$ of $\{0,\dots,n\}$,
the dimension of the Minkowski sum $\sum_{i\in I} P_i$ is at least $|I|$. In this case, the
sparse resultant depends on the coefficients of all the polynomials; this is the
case of the homogeneous resultant. When the codimension of $H_A$ is greater
than 1, the sparse resultant is defined to be the constant 1. For example, set
$n = 4$ and consider polynomials of the form

$$
\begin{aligned}
f_0 &= a_1x_1 + a_2x_2 + a_3x_3 + a_4x_4 + a_5\\
f_1 &= b_1x_1 + b_2x_2\\
f_2 &= c_1x_1 + c_2x_2\\
f_3 &= b_3x_3 + b_4x_4\\
f_4 &= c_3x_3 + c_4x_4.
\end{aligned}
$$
Then, the existence of a common root in the torus implies the vanishing of both
determinants $b_1c_2 - b_2c_1$ and $b_3c_4 - b_4c_3$, i.e. the variety $H_A$ has codimension
two. In this case, the sparse resultant is defined to be 1 and it does not vanish
for those vectors of coefficients for which there is a common root. Another
unexpected example is the following, which corresponds to a non-essential
family. Set $n = 2$ and let

$$
\begin{aligned}
f_0 &= a_1x_1 + a_2x_2 + a_3\\
f_1 &= b_1x_1 + b_2x_2\\
f_2 &= c_1x_1 + c_2x_2.
\end{aligned}
$$

In this case, the sparse resultant equals the determinant $b_1c_2 - b_2c_1$, which
does not depend on the coefficients of $f_0$.

There are also arithmetic issues that come into the picture, as in the
following simple example. Set $n = 1$ and consider two univariate polynomials
of degree 2 of the form $f_0 = a_0 + b_0x^2$, $f_1 = a_1 + b_1x^2$. In this case, the
sparse resultant equals the determinant $D := a_0b_1 - b_0a_1$. But if we think
of $f_0, f_1$ as being degree 2 polynomials with vanishing $x$-coefficient, and we
compute their univariate resultant $\mathrm{Res}_{2,2}(f_0,f_1)$, the answer is $D^2$. The exponent
2 is precisely the index in the lattice $\mathbb{Z}$ of the sublattice
$2\mathbb{Z}$ generated by the exponents in $f_0, f_1$. As in the case of the projective
resultant, there is an associated algebraic compactification $X_{A_0,\dots,A_n}$ of the
$n$-torus, called the toric variety associated to the family of supports, which
contains $(K\setminus\{0\})^n$ as a dense open set. For essential families, the sparse
resultant vanishes at a vector of coefficients if and only if the closures of
the hypersurfaces $(f_i = 0)$, $i = 0,\dots,n$, have a common point of intersection
in $X_{A_0,\dots,A_n}$. In the bihomogeneous example (1.58) that we considered,
$A_i = \{(0,0),(1,0),(0,1),(1,1)\}$ are the vertices of the unit square in the
plane for $i = 0,1,2$, and the corresponding toric variety is the product variety
$\mathbb{P}^1(K) \times \mathbb{P}^1(K)$.
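
The discrepancy between the sparse resultant $D$ and the dense resultant $D^2$ can be observed directly on the Sylvester matrix of two quadratics with vanishing linear term. The sketch below is illustrative only: the function names and the numerical coefficients are assumptions, and the row layout of the Sylvester matrix is one common convention.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def sylvester_resultant_quadratics(p, q):
    """Resultant of p0 + p1*x + p2*x^2 and q0 + q1*x + q2*x^2 via the Sylvester matrix."""
    p0, p1, p2 = p
    q0, q1, q2 = q
    matrix = [
        [p2, p1, p0, 0],
        [0, p2, p1, p0],
        [q2, q1, q0, 0],
        [0, q2, q1, q0],
    ]
    return det(matrix)


# f0 = a0 + b0*x^2 and f1 = a1 + b1*x^2 with arbitrary integer coefficients.
a0, b0, a1, b1 = 3, 5, -2, 7
D = a0 * b1 - b0 * a1
dense = sylvester_resultant_quadratics((a0, 0, b0), (a1, 0, b1))
print(D, dense)
```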

Sparse resultants are in turn a special case of residual resultants. Roughly
speaking, we have families of polynomials which generically have some fixed
common points of intersection, and we want to find the condition under which
these are the only common roots. Look for instance at the homogeneous case:
for any choice of positive degrees $d_0,\dots,d_n$, generic polynomials $F_0,\dots,F_n$
with these degrees will all vanish at the origin $0 \in K^{n+1}$, and the homogeneous
resultant $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ is non-zero if and only if the origin
is the only common solution. This problem arises naturally when trying to
find implicit equations for families of parametric surfaces with base points of
codimension greater than 1. We refer to Chapter 3 and to [Bus03, BEM03]
for more background and applications.
1.7 Residues and resultants


In this section we would like to discuss some of the connections between 
residues and resultants. We will also sketch a method, based on residues, to 
compute multidimensional resultants which, as far as we know, has not been 
made explicit before. 

Suppose $P(z), Q(z)$ are univariate polynomials of respective degrees $d_1, d_2$
as in (1.19) and let $Z_P = \{\xi_1,\dots,\xi_r\}$ be the zero locus of $P$. If $Q$ is regular
on $Z_P$, equivalently $\mathrm{Res}_{d_1,d_2}(P,Q) \neq 0$, then the global residue $\mathrm{res}_P(1/Q)$ is
defined and the result will be a rational function of the coefficients $(a,b)$ of $P$
and $Q$. Thus, it is reasonable to expect that the denominator of this rational
function (in a minimal expression) will be the resultant. This is the content
of the following proposition:


Proposition 1.7.1. For any $k = 0,\dots,d_1+d_2-2$, the residue $\mathrm{res}_P(z^k/Q)$ is a
rational function of the coefficients $(a,b)$ of $P,Q$, and there exists a polynomial
$C_k \in \mathbb{Z}[a,b]$ such that

$$\mathrm{res}_P(z^k/Q) = \frac{C_k(a,b)}{\mathrm{Res}_{d_1,d_2}(P,Q)}.$$

Proof. We have from (1.26) that

$$1 = \frac{A_1}{\mathrm{Res}_{d_1,d_2}(P,Q)}\,P + \frac{A_2}{\mathrm{Res}_{d_1,d_2}(P,Q)}\,Q$$

with $A_1, A_2 \in \mathbb{Z}[a,b][z]$, $\deg(A_1) = d_2 - 1$, and $\deg(A_2) = d_1 - 1$. Then,

$$\mathrm{res}_P(z^k/Q) = \mathrm{res}_P\Big(\frac{z^k A_2}{\mathrm{Res}_{d_1,d_2}(P,Q)}\Big)$$

and we deduce from Corollary 1.1.7 that there exists a polynomial $C'_k(a,b) \in
\mathbb{Z}[a,b]$ such that

$$\mathrm{res}_P(z^k/Q) = \frac{C'_k(a,b)}{\mathrm{Res}_{d_1,d_2}(P,Q)\ a_{d_1}^{k+1}}.$$

Thus, it suffices to show that $a_{d_1}^{k+1}$ divides $C'_k(a,b)$. But, since $k \le d_1+d_2-2$
we know from (1.11) that

$$\mathrm{res}_P(z^k/Q) = -\mathrm{res}_Q(z^k/P) = \frac{C''_k(a,b)}{\mathrm{Res}_{d_1,d_2}(P,Q)\ b_{d_2}^{k+1}}$$

for a suitable polynomial $C''_k \in \mathbb{Z}[a,b]$. Since $\mathrm{Res}_{d_1,d_2}(P,Q)$ is irreducible,
the result follows.
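
The statement can be observed in the smallest non-trivial case $d_1 = 1$, $d_2 = 2$. The sketch below assumes that the global residue of $H$ with respect to a polynomial $P$ with simple roots is $\sum_{P(\xi)=0} H(\xi)/P'(\xi)$, and that $\mathrm{Res}_{1,2}(P,Q) = a_1^2\,Q(-a_0/a_1)$ for $P = a_0 + a_1z$; the function names and numerical values are chosen only for illustration.

```python
from fractions import Fraction


def Q_value(b, z):
    """Evaluate Q(z) = b0 + b1*z + b2*z^2."""
    b0, b1, b2 = b
    return b0 + b1 * z + b2 * z * z


def residue_linear(a, b, k):
    """res_P(z^k / Q) for P = a0 + a1*z: a single simple root xi, P'(xi) = a1."""
    a0, a1 = a
    xi = Fraction(-a0, a1)
    return xi ** k / (Q_value(b, xi) * a1)


def resultant_1_2(a, b):
    """Res_{1,2}(P, Q) = a1^2 * Q(-a0/a1), written without denominators."""
    a0, a1 = a
    b0, b1, b2 = b
    return b0 * a1 * a1 - b1 * a0 * a1 + b2 * a0 * a0


a, b = (3, 2), (1, -4, 5)          # P = 3 + 2z, Q = 1 - 4z + 5z^2
res = resultant_1_2(a, b)
numerators = [residue_linear(a, b, k) * res for k in (0, 1)]
print(res, numerators)
```

In this case the admissible exponents are $k = 0, 1$, and the numerators come out as $C_0 = a_1$ and $C_1 = -a_0$, both polynomials with integer coefficients.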
Note that according to Theorem 1.5.17, we have

$$\mathrm{res}^{\mathbb{P}^1}_{\langle \tilde P,\tilde Q\rangle}(z^k) = \mathrm{res}_P(z^k/Q) = -\mathrm{res}_Q(z^k/P),$$

where $\tilde P, \tilde Q$ denote the homogenization of $P$ and $Q$, respectively. This is the
basis for the generalization of Proposition 1.7.1 to the multidimensional case.
The following is a special case of [CDS98, Th. 1.4].

Theorem 1.7.2. Let $F_i(x) = \sum_{|\alpha|=d_i} a_{i\alpha}x^\alpha \in \mathbb{C}[x_0,\dots,x_n]$, $i = 0,\dots,n$, be
homogeneous polynomials of degrees $d_0,\dots,d_n$. Then, for any monomial $x^\beta$
with $|\beta| = \rho = \sum_i (d_i - 1)$, the homogeneous residue $\mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(x^\beta)$ is a rational
function of the coefficients $\{a_{i\alpha}\}$ which can be written as

$$\mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(x^\beta) = \frac{C_\beta(a_{i\alpha})}{\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)}$$

for a suitable polynomial $C_\beta \in \mathbb{Z}[a_{i\alpha}]$.


We sketch a proof of this result, based on [Jou97, CDS98] and the notion 
of the determinant of a complex [GKZ94]. 


Proof. We keep the notation of (1.56), but we consider now the map
"at level $\rho$"

$$
\begin{aligned}
S_{\rho-d_0} \times \cdots \times S_{\rho-d_n} \times S_0 &\longrightarrow S_\rho\\
(G_0,\dots,G_n,\lambda) &\longmapsto G_0F_0 + \cdots + G_nF_n + \lambda\Delta_0, \qquad (1.60)
\end{aligned}
$$

where $\Delta_0$ is defined in (1.57). For any specialization in $K$ of the coefficients of
$F_0,\dots,F_n$ (with respective degrees $d_0,\dots,d_n$), we get a $K$-linear map between
finite dimensional $K$-vector spaces which is surjective if and only if $F_0,\dots,F_n$
do not have a common root in $K^{n+1}\setminus\{0\}$, or equivalently, if and only if
the resultant $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ is non-zero. Moreover, it holds that the
resultant equals the greatest common divisor of all maximal minors of the
above map. Let $U$ be the intersection of the Zariski open sets in the space of
coefficients $a = (a_{i\alpha})$ of the given polynomials where all (non-identically zero)
maximal minors do not vanish. For $a \in U$, the specialized $K$-linear map is
surjective and for any monomial $x^\beta$ of degree $\rho$ we can write

$$x^\beta = \sum_{i=0}^{n} A_i(a;x)\,F_i(a;x) + \lambda(a)\,\Delta_0(a;x),$$

where $\lambda$ depends rationally on $a$. Since the residue vanishes on the first sum
and takes the value 1 on $\Delta_0$, we have that

$$\mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(x^\beta) = \lambda(a).$$

This implies that every maximal minor which is not identically zero must
involve the last column and that $\lambda(a)$ is unique. Thus, it follows from Cramer's
rule that $\mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(x^\beta)$ may be written as a rational function with denominator
$M$ for all non-identically zero maximal minors $M$. Consequently it may also
be written as a rational function with denominator $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$.
In fact, (1.60) can be extended to a generically exact complex

$$0 \to S_{d_0-(n+1)}\times\cdots\times S_{d_n-(n+1)} \to \cdots \to S_{\rho-d_0}\times\cdots\times S_{\rho-d_n}\times S_0 \to S_\rho \to 0,$$

which is a graded piece of the Koszul complex associated to $F_0,\dots,F_n$, and which
is exact if and only if $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) \neq 0$. Moreover, the resultant
equals (once we index appropriately the terms and choose monomial bases 
for them) the determinant of the complex. This concept goes back to Cayley 
[Cay48] and generalizes the determinant of a linear map between two vector 
spaces of the same dimension with chosen bases. For short exact sequences
of finite-dimensional vector spaces $V_{-1}, V_0, V_1$ with respective chosen bases,
the determinant of the based complex is defined as follows [GKZ94, Appendix
A]. Call $d_{-1}$ and $d_0$ the linear maps

$$0 \to V_{-1} \xrightarrow{\;d_{-1}\;} V_0 \xrightarrow{\;d_0\;} V_1 \to 0,$$

and let $\ell_i = \dim V_i$, $i = -1,0,1$. Thus, $\ell_0 = \ell_{-1}+\ell_1$. Denote by $M_{-1}$ and $M_0$
the respective matrices of $d_{-1}$ and $d_0$ in the chosen bases. Choose any subset
$I$ of $\{1,\dots,\ell_0\}$ with $|I| = \ell_{-1}$ and let $M^I_{-1}$ be the submatrix of $M_{-1}$ given
by all the $\ell_{-1}$ rows and the $\ell_{-1}$ columns corresponding to the index set $I$.
Similarly, denote by $M^I_0$ the submatrix of $M_0$ given by the $\ell_1$ rows indexed
by the complement of $I$ and all the $\ell_1$ columns. Then, it can be easily checked
that $\det(M^I_{-1}) \neq 0 \iff \det(M^I_0) \neq 0$. Moreover, up to (an explicit) sign, it
holds that whenever they are nonzero, the quotient of determinants

$$\frac{\det(M^I_{-1})}{\det(M^I_0)}$$


is independent of the choice of J. The determinant of the based complex is 
then defined to be this common value. In the case of the complex given by 
a graded piece of the Koszul complex we are considering, the hypotheses of 
[GKZ94, Appendix A, Th. 34] are fulfilled, and its determinant equals the
greatest common divisor of all maximal minors of the rightmost map (1.60) we
considered in the proof of Theorem 1.7.2.
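The definition of the determinant of a based short exact complex can be tried out on a toy case. The following is a minimal sketch, not taken from the text: the matrices are made-up illustrative data, and the sketch assumes the row-vector convention suggested by the description above ($M_{-1}$ has $\ell_{-1}$ rows and $\ell_0$ columns, $M_0$ has $\ell_0$ rows and $\ell_1$ columns, and $M_{-1}M_0 = 0$). It computes the quotient of determinants for every admissible index set $I$.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not m:
        return Fraction(1)
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Hypothetical based complex 0 -> K^2 -> K^3 -> K^1 -> 0 (row vectors act on the left).
M_minus1 = [[Fraction(1), Fraction(2), Fraction(3)],
            [Fraction(4), Fraction(5), Fraction(6)]]
M_0 = [[Fraction(1)], [Fraction(-2)], [Fraction(1)]]

def quotients(m_minus1, m_0):
    """Map each index set I (0-based) to det(M^I_{-1}) / det(M^I_0) when defined."""
    l_minus1, l_0 = len(m_minus1), len(m_0)
    result = {}
    for I in combinations(range(l_0), l_minus1):
        sub_minus1 = [[row[j] for j in I] for row in m_minus1]      # columns in I
        sub_0 = [m_0[i] for i in range(l_0) if i not in I]          # rows outside I
        d_minus1, d_0 = det(sub_minus1), det(sub_0)
        if d_0 != 0:
            result[I] = d_minus1 / d_0
    return result

q = quotients(M_minus1, M_0)
for I, value in sorted(q.items()):
    print(I, value)
```

In this toy case the three quotients agree up to sign, which is the independence property used to define the determinant of the complex.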


We recall that, by b) in Theorem 1.5.16, the graded piece of degree $\rho$
in the graded algebra $A = \mathbb{C}[x_0,\dots,x_n]/\langle F_0,\dots,F_n\rangle$ is one-dimensional. We
can exploit this fact together with the relation between residues and resultants
to propose a new algorithm for the computation of resultants. Given a term
order $<$, there will be a unique standard monomial of degree $\rho$, the smallest
monomial $x^{\beta_0}$, relative to $<$, not in the ideal $\langle F_0,\dots,F_n\rangle$. Consequently, for
any $H \in \mathbb{C}[x_0,\dots,x_n]_\rho$, its normal form $N(H)$ relative to the reduced Gröbner
basis for $<$ will be a multiple of $x^{\beta_0}$.

In particular, let $\Delta \in \mathbb{C}[x_0,\dots,x_n]$ be the element of degree $\rho$ and
homogeneous residue 1 constructed in Section 1.5.6. We can write

$$N(\Delta) = \frac{P(a_{i\alpha})}{Q(a_{i\alpha})}\cdot x^{\beta_0}.$$
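Theorem 1.7.3. With notation as above, if $P(a_{i\alpha})$ and $Q(a_{i\alpha})$ are relatively
prime, then

$$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) = P(a_{i\alpha}).$$

Proof. We have:

$$1 = \mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}(\Delta) = \mathrm{res}^{\mathbb{P}^n}_{\langle F\rangle}\!\left(\frac{P(a_{i\alpha})}{Q(a_{i\alpha})}\,x^{\beta_0}\right) = \frac{P(a_{i\alpha})}{Q(a_{i\alpha})}\cdot\frac{C_{\beta_0}(a_{i\alpha})}{\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)}.$$

Therefore

$$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)\,Q(a_{i\alpha}) = P(a_{i\alpha})\,C_{\beta_0}(a_{i\alpha}),$$

but since $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ is irreducible and coprime with $C_{\beta_0}(a_{i\alpha})$, this
implies the assertion.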


Remark 1.7.4. Note that Theorem 1.7.3 holds even if the polynomials $F_i$ are
not densely supported, as long as the resultant $\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n)$ is not
identically zero.


Consider the example from Section 1.6.1: 


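$$F_0 = a_0x_0 + a_1x_1 + a_2x_2$$
$$F_1 = b_0x_0 + b_1x_1 + b_2x_2$$
$$F_2 = c_1x_0^2 + c_2x_1^2 + c_3x_2^2 + c_4x_0x_1 + c_5x_0x_2 + c_6x_1x_2$$

Then $\rho = 1$ and

$$\Delta = \det\begin{pmatrix} a_0 & a_1 & a_2\\ b_0 & b_1 & b_2\\ c_1x_0 + c_4x_1 + c_5x_2 & c_2x_1 + c_6x_2 & c_3x_2 \end{pmatrix}.$$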

We can now read off the resultant $\mathrm{Res}_{1,1,2}(F_0,F_1,F_2)$ from the normal form
of $\Delta$ relative to any Gröbner basis of $I = \langle F_0,F_1,F_2\rangle$. For example, computing
relative to grevlex with $x_0 > x_1 > x_2$, we have:

$$\begin{aligned}
N(\Delta) = \bigl(\,& a_0^2b_1^2c_3 - a_0^2b_1b_2c_6 + a_0^2b_2^2c_2 + a_0a_1b_0b_2c_6 - a_0a_2b_1^2c_5\\
&+ a_0a_1b_1b_2c_5 - a_0a_1b_2^2c_4 + a_0a_2b_0b_1c_6 + a_0a_2b_1b_2c_4 - 2a_0a_1b_0b_1c_3\\
&+ a_1^2b_0^2c_3 - a_1^2b_0b_2c_5 + a_1^2b_2^2c_1 - a_1a_2b_0^2c_6 + a_1a_2b_0b_1c_5 + a_1a_2b_0b_2c_4\\
&- 2a_0a_2b_0b_2c_2 - 2a_1a_2b_1b_2c_1 + a_2^2b_0^2c_2 - a_2^2b_0b_1c_4 + a_2^2b_1^2c_1\,\bigr)\big/(a_0b_1 - a_1b_0)\;\cdot\; x_2
\end{aligned}$$

and the numerator of the coefficient of $x_2$ in this expression is the resultant.
Its denominator is the subresultant polynomial in the sense of [Cha95], whose
vanishing is equivalent to the condition $x_2 \in I$.
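The shape of this numerator can be sanity-checked numerically. The sketch below relies on a standard fact that is not stated in the text and is used here as an assumption: two linear forms $F_0, F_1$ in three variables have the cross product $a \times b$ of their coefficient vectors as common projective zero, and the resultant of two linear forms and a quadric equals, up to sign, the quadric evaluated at that point. The function names are illustrative only.

```python
def cross(a, b):
    """Cross product of the coefficient vectors of F_0 and F_1."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def quadric(c, x):
    """F_2 with coefficients c = (c1, ..., c6) evaluated at x = (x0, x1, x2)."""
    c1, c2, c3, c4, c5, c6 = c
    x0, x1, x2 = x
    return (c1 * x0 * x0 + c2 * x1 * x1 + c3 * x2 * x2
            + c4 * x0 * x1 + c5 * x0 * x2 + c6 * x1 * x2)

def resultant_112(a, b, c):
    """Assumed formula: Res_{1,1,2}(F_0, F_1, F_2) = F_2(a x b), up to sign."""
    return quadric(c, cross(a, b))

# F_0 = x0 and F_1 = x1 meet at (0 : 0 : 1); F_2 vanishes there exactly when c3 = 0.
print(resultant_112((1, 0, 0), (0, 1, 0), (1, 1, 0, 1, 1, 1)))  # common root
print(resultant_112((1, 0, 0), (0, 1, 0), (1, 1, 1, 0, 0, 0)))  # no common root
```

The third component of the cross product is $a_0b_1 - a_1b_0$, the denominator appearing in the normal form above.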

Theorem 1.7.3 is a special case of a more general result which holds in the
context of toric varieties [CD]. We will not delve into this general setup here
but will conclude this section by illustrating this computational method in the
case of the sparse polynomials described in (1.58). As noted in Section 1.6.2,
the homogeneous resultant of these three polynomials is identically zero. We
may however view them as three polynomials with support in the unit square
$P \subset \mathbb{R}^2$ and consider their homogenization relative to $P$. This is equivalent to
compactifying the torus $(\mathbb{C}^*)^2$ as $\mathbb{P}^1\times\mathbb{P}^1$ and considering the natural
homogenizations of our polynomials in the homogeneous coordinate ring of $\mathbb{P}^1\times\mathbb{P}^1$,
i.e. the ring of polynomials $\mathbb{C}[x_1,y_1,x_2,y_2]$ bigraded by $(\deg_{x_1,y_1},\deg_{x_2,y_2})$.
We have:


$$F_i(x_1,y_1,x_2,y_2) = a_ix_1x_2 + b_ix_1y_2 + c_ix_2y_1 + d_iy_1y_2, \qquad a_i,b_i,c_i,d_i \in K.$$

These polynomials have the property that

$$F_i(\lambda_1x_1,\lambda_1y_1,\lambda_2x_2,\lambda_2y_2) = \lambda_1\lambda_2\,F_i(x_1,y_1,x_2,y_2),$$

for all nonzero $\lambda_1,\lambda_2$.

Notice that $\langle F_0,F_1,F_2\rangle \subset \langle x_1,x_2,y_1y_2\rangle$ and we can take as $\Delta$ the
determinant of any matrix that expresses the $F_i$ in terms of those monomials. For
example

$$\Delta = \det\begin{pmatrix} a_0x_2 + b_0y_2 & c_0y_1 & d_0\\ a_1x_2 + b_1y_2 & c_1y_1 & d_1\\ a_2x_2 + b_2y_2 & c_2y_1 & d_2 \end{pmatrix}.$$

We point out that in this case $\rho = (1,1) = 3(1,1)-(2,2)$, which is the bidegree
of $\Delta$. If we consider for instance the reverse lexicographic term order with
$y_2 < y_1 < x_2 < x_1$, the least monomial of degree $\rho$ is $y_1y_2$. The normal form
of $\Delta$ modulo a Gröbner basis of the bi-homogeneous ideal $\langle F_0,F_1,F_2\rangle$ equals
a coefficient times $y_1y_2$. This coefficient is a rational function of $(a_0,\dots,d_2)$
whose numerator is the $\mathbb{P}^1\times\mathbb{P}^1$ resultant of $F_0,F_1,F_2$ displayed in (1.59). We
invite the reader to check that its denominator equals the determinant of the
$3\times 3$ square submatrix of the matrix of coefficients of the given polynomials

$$\begin{pmatrix} a_0 & b_0 & c_0\\ a_1 & b_1 & c_1\\ a_2 & b_2 & c_2 \end{pmatrix}.$$

Again, this is precisely the subresultant polynomial whose vanishing is
equivalent to $y_1y_2 \in \langle F_0,F_1,F_2\rangle$ (cf. also [DKk)]).


As a final remark, we mention briefly the relation between residues, re- 
sultants and rational A-hypergeometric functions in the sense of Gel’fand, 
Kapranov and Zelevinsky [GZK89]. Recall that given a configuration

$$A = \{a_1,\dots,a_n\} \subset \mathbb{Z}^p$$

or, equivalently, an integral $p\times n$ matrix $A$, a function $F$, holomorphic in an
open set $U \subset \mathbb{C}^n$, is said to be $A$-hypergeometric of degree $\beta \in \mathbb{C}^p$ if and only
if it satisfies the differential equations:

$$\partial^uF - \partial^vF = 0,$$

for all $u,v \in \mathbb{N}^n$ such that $A\cdot u = A\cdot v$, where $\partial^u = \dfrac{\partial^{|u|}}{\partial z_1^{u_1}\cdots\partial z_n^{u_n}}$, and

$$\sum_{j=1}^{n} a_{ij}\,z_j\,\frac{\partial F}{\partial z_j} = \beta_i F$$

for all $i = 1,\dots,p$. The study of $A$-hypergeometric functions is a very
active area of current research with many connections to computational and
commutative algebra. We refer the reader to [SST00] for a comprehensive
introduction and restrict ourselves to the discussion of a simple example.

Let $\Sigma(d)$ denote the set of integer points in the $m$-simplex

$$\Bigl\{u \in \mathbb{R}^m_{\geq 0} : \sum_{j=1}^{m} u_j \leq d\Bigr\}.$$

Let $A \subset \mathbb{Z}^{2m+1}$ be the Cayley configuration

$$A = (\{e_0\}\times\Sigma(d)) \cup\cdots\cup (\{e_m\}\times\Sigma(d)).$$

Let $f_i(t) = \sum_{\alpha\in\Sigma(d)} z_{i\alpha}t^\alpha$, $i = 0,\dots,m$, be an $(m+1)$-tuple of generic
polynomials supported in $\Sigma(d)$. Denote by $F_i(x_0,\dots,x_m)$ the homogenization of
$f_i$. Given an $(m+1)$-tuple of positive integers $a = (a_0,\dots,a_m)$ let $\langle F^a\rangle$ be
the collection $\langle F_0^{a_0},\dots,F_m^{a_m}\rangle$. The following result is a special case of a more
general result (see [AS96, CD97, CDS01]) involving the Cayley product of a
general family of configurations $A_i \subset \mathbb{Z}^m$, $i = 0,\dots,m$.


Theorem 1.7.5. For any $b \in \mathbb{N}^{m+1}$ with $|b| = d|a|-(m+1)$, the homogeneous
residue $\mathrm{res}^{\mathbb{P}^m}_{\langle F^a\rangle}(x^b)$, viewed as a function of the coefficients $z_{i\alpha}$, is a rational
$A$-hypergeometric function of degree $\beta = (-a_0,\dots,-a_m,-b_1-1,\dots,-b_m-1)$.

Suppose, for example, that $m = 2$ and $d = 1$. Then, we have

$$A = \begin{pmatrix}
1&1&1&0&0&0&0&0&0\\
0&0&0&1&1&1&0&0&0\\
0&0&0&0&0&0&1&1&1\\
0&1&0&0&1&0&0&1&0\\
0&0&1&0&0&1&0&0&1
\end{pmatrix}$$

and $F_i(x_0,x_1,x_2) = a_{i0}x_0 + a_{i1}x_1 + a_{i2}x_2$. Let $a = (2,1,1)$ and $b = (0,1,0)$.
Then the residue $\mathrm{res}^{\mathbb{P}^2}_{\langle F^a\rangle}(x_1)$ can be computed using Algorithm 2 in
Section 1.5.6 to obtain the following rational function

$$(a_{20}a_{12} - a_{10}a_{22})/\det(a_{ij})^2.$$

Note that, according to Theorem 1.7.2 and (1.55), the denominator of the
above expression is the homogeneous resultant

$$\mathrm{Res}_{2,1,1}(F_0^2,F_1,F_2) = \mathrm{Res}_{1,1,1}(F_0,F_1,F_2)^2.$$


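Indeed, as

$$\frac{x_1}{F_0^2F_1F_2} = -\frac{\partial}{\partial a_{01}}\left(\frac{1}{F_0F_1F_2}\right),$$

differentiation “under the integral sign” gives the equality

$$\mathrm{res}^{\mathbb{P}^2}_{\langle F^a\rangle}(x_1) = -\frac{\partial}{\partial a_{01}}\left(\frac{1}{\det(a_{ij})}\right).$$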
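For readers who want the intermediate step, the derivative can be expanded as follows. This is only a worked sketch: $C_{01}$ is notation introduced here (not in the text) for the cofactor of $a_{01}$ in the matrix $(a_{ij})$, and the residue is abbreviated to $\mathrm{res}(x_1)$.

```latex
\begin{aligned}
\mathrm{res}(x_1)
  &= -\frac{\partial}{\partial a_{01}}\left(\frac{1}{\det(a_{ij})}\right)
   = \frac{1}{\det(a_{ij})^{2}}\,\frac{\partial \det(a_{ij})}{\partial a_{01}}
   = \frac{C_{01}}{\det(a_{ij})^{2}},\\[4pt]
C_{01}
  &= -\det\begin{pmatrix} a_{10} & a_{12}\\ a_{20} & a_{22}\end{pmatrix}
   = a_{20}a_{12} - a_{10}a_{22}.
\end{aligned}
```

This agrees with the rational function $(a_{20}a_{12} - a_{10}a_{22})/\det(a_{ij})^2$ quoted for the residue.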


One can also show that the determinant $\det(a_{ij})$ agrees with the discriminant
of the configuration $A$. We should point out that Gel’fand, Kapranov and
Zelevinsky have shown that the irreducible components of the singular locus
of the $A$-hypergeometric system for any degree $\beta$ have as defining equations
the discriminant of $A$ and of its facial subsets, which in this case correspond
to all minors of $(a_{ij})$.

In [CDS01] it is conjectured that essentially all rational $A$-hypergeometric
functions whose denominators are a multiple of the $A$-discriminant arise as
the toric residues of Cayley configurations. We refer to [CDS02, CD04] for
further discussion of this conjecture.
Summary. This chapter studies algebras obtained as the quotient of a polynomial
ring by an ideal of finite codimension. These algebras have a rich supply of interesting 
linear maps whose eigenvalues, eigenvectors, and characteristic polynomials can be 
used to solve systems of polynomial equations. We will also discuss applications to 
resultants, factorization, primary decomposition, and Galois theory. 


2.0 Introduction 


This chapter will consider the quotient ring 


$$A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s\rangle$$


where $K$ is a field and $f_1,\dots,f_s$ are polynomials in $x_1,\dots,x_n$ with coefficients
in $K$. The ring $A$ is also a vector space over $K$ in a compatible way, so that
A is an algebra over K. These are the “algebras” in the title of the chapter. 
For us, the most interesting case is when A has finite dimension as a vector 
space over K. We call A a finite commutative algebra when this happens. 


What’s Covered. We will first use the algebra A to determine the solutions 
of the polynomial system 


$$f_1(x_1,\dots,x_n) = \cdots = f_s(x_1,\dots,x_n) = 0.$$


We will then use the dual space of A to give an interesting description of 
the ideal (f1,..., fs). In the remaining sections of the chapter, we will learn 
that finite commutative algebras can be used in a variety of other contexts, 
including the following: 


• Resultants.
• Factoring over number fields and finite fields.
• Primary decomposition.
• Galois theory.
In all of these applications, multiplication maps play a central role. Given a
finite commutative algebra $A$, an element $a \in A$ gives a multiplication map

$$M_a : A \longrightarrow A$$

defined by $M_a(b) = ab$ for $b \in A$. This is a linear map from a finite-dimensional
vector space to itself, which means the many tools of linear algebra can be
brought to bear to study $M_a$. Furthermore, since $A$ is commutative, the linear
maps $M_a$ all commute as we vary $a \in A$.


What’s Omitted. We will not discuss everything of interest connected with 
finite commutative algebras. The three main topics not covered are: 


• Gorenstein duality.
• Real solutions.
• Border bases.


Duality is covered in Chapter 3 and also in [EM96, EM98], and an introduction 
to real solutions of polynomial equations appears in [CLO98]. Border bases 
are discussed in Chapters 3 (briefly) and 4 (in more detail). 


Notation. Given $A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s\rangle$ and $f \in K[x_1,\dots,x_n]$, we
will use the following notation:


• $[f] \in A$ is the coset of $f$ in the quotient algebra $A$.
• $M_f$ is the multiplication map $M_{[f]}$. Thus $M_f([g]) = [fg]$ for all $[g] \in A$.
• $\mathbf{M}_f$ is the matrix of $M_f$ relative to a chosen basis of $A$ over $K$.


Other notation will be introduced as needed. 


2.1 Solving equations 


This section will cover basic material on solving equations using eigenvalues 
and eigenvectors of multiplication maps on finite dimensional algebras. 


2.1.1 The finiteness theorem and Gröbner bases


Consider a system of polynomial equations

$$\begin{aligned}
f_1(x_1,\dots,x_n) &= 0\\
f_2(x_1,\dots,x_n) &= 0\\
&\;\;\vdots\\
f_s(x_1,\dots,x_n) &= 0
\end{aligned} \qquad (2.1)$$

in $n$ variables $x_1,\dots,x_n$ with coefficients in a field $K$. Here is an example from
[MS95] that we will use throughout this section and the next.
Example 2.1.1. Consider the equations

$$\begin{aligned}
f_1 &= x^2 + 2y^2 - 2y = 0\\
f_2 &= xy^2 - xy = 0\\
f_3 &= y^3 - 2y^2 + y = 0
\end{aligned} \qquad (2.2)$$

over the complex numbers $\mathbb{C}$. If we write the first and third equations as

$$f_1 = x^2 + 2y(y-1) = 0 \quad\text{and}\quad f_3 = y(y-1)^2 = 0,$$

then it follows easily that the only solutions are the points $(0,0)$ and $(0,1)$.
(Exercise: Prove this.) However, this ignores multiplicities, which as we will
see are perfectly captured by the algebra $A = \mathbb{C}[x,y]/\langle f_1,f_2,f_3\rangle$.


Our first major result is the Finiteness Theorem, which gives a necessary 
and sufficient condition for the algebra corresponding to the equations (2.1) 
to be finite-dimensional over K. 


Theorem 2.1.2. The algebra

$$A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s\rangle$$

is finite-dimensional over $K$ if and only if the equations (2.1) have only finitely
many solutions over the algebraic closure $\overline{K}$.


Proof. We will sketch the main ideas since this result is so important. A 
complete proof can be found in Chapter 5, §3 of [CLO97]. 

First suppose that $A$ is finite-dimensional over $K$. Then, for each $i$, the set
$\{[1],[x_i],[x_i^2],\dots\} \subset A$ must be linearly dependent, so that there is a nonzero
polynomial $p_i(x_i)$ such that $[p_i(x_i)] = [0]$ in $A$. This means that

$$p_i(x_i) \in \langle f_1,\dots,f_s\rangle,$$

which easily implies that $p_i$ vanishes at all common solutions of (2.1). It follows
that for each $i$, the solutions have only finitely many distinct $i$th coordinates.
Hence the number of solutions is finite.

Going the other way, suppose that there are only finitely many solutions
over $\overline{K}$. Then in particular there are only finitely many $i$th coordinates, so
that we can find a nonzero polynomial $q_i(x_i)$ which vanishes on all solutions
of (2.1) over $\overline{K}$. In this situation, Hilbert’s Nullstellensatz (see Chapter 4, §1
of [CLO97] for a proof) asserts that

$$p_i(x_i) = q_i(x_i)^N \in \langle f_1,\dots,f_s\rangle$$

for some sufficiently large integer $N$.

Now consider the lexicographic order $>_{\mathrm{lex}}$ on monomials $x^\alpha = x_1^{a_1}\cdots x_n^{a_n}$.
Recall that $x^\alpha > x^\beta$ if $a_1 > b_1$, or $a_1 = b_1$ and $a_2 > b_2$, or ... (in other words,
the left-most nonzero entry of $\alpha - \beta \in \mathbb{Z}^n$ is positive). This allows us to define
the leading term of any nonzero polynomial in $K[x_1,\dots,x_n]$.

The theory of Gröbner bases (explained in Chapters 2 and 5 of [CLO97])
implies that $\langle f_1,\dots,f_s\rangle$ has a Gröbner basis $g_1,\dots,g_t$ with the following
properties:

• $g_1,\dots,g_t$ form a basis of $\langle f_1,\dots,f_s\rangle$.
• The leading term of every nonzero element of $\langle f_1,\dots,f_s\rangle$ is divisible by
the leading term of one of the $g_j$.
• The set of remainder monomials
$$B = \{x^\alpha \mid x^\alpha \text{ is not divisible by the leading term of any } g_j\}$$
gives cosets $[x^\alpha]$, $x^\alpha \in B$, that form a basis of $A$ over $K$.

Since the leading term of $p_i(x_i)$ is a power of $x_i$, the second bullet implies
that the leading term of some $g_j$ is a power of $x_i$. It follows that in any $x^\alpha \in B$,
$x_i$ must appear to strictly less than this power. Since this is true for all $i$, it
follows that $B$ is finite, so that $A$ is finite-dimensional by the third bullet.


More details about monomial orders, leading terms, and Gröbner bases
can also be found in Chapter 2 of [CLO97].

Let’s apply Theorem 2.1.2 to our example.


Example 2.1.3. For the equations $f_1 = f_2 = f_3 = 0$ of Example 2.1.1, one can
show that $f_1, f_2, f_3$ form a Gröbner basis for lexicographic order with $x > y$.
Thus the leading terms of the polynomials in the Gröbner basis are

$$x^2,\quad xy^2,\quad y^3,$$

so that the remainder monomials (= monomials not divisible by any of these
leading terms) are

$$B = \{1, y, y^2, x, xy\}.$$

(Exercise: Verify this.) Hence $A$ has dimension 5 over $\mathbb{C}$ in this case.
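The remainder monomials of this example can be enumerated mechanically. The following is a minimal sketch under the assumption that monomials are encoded as exponent pairs (degree in $x$, degree in $y$); the function names and the search bound are illustrative choices, not part of the text.

```python
from itertools import product

# Leading terms of the Groebner basis of Example 2.1.3 as exponent vectors (deg_x, deg_y).
leading_terms = [(2, 0), (1, 2), (0, 3)]   # x^2, x*y^2, y^3

def divides(lt, mono):
    """True if the monomial with exponents lt divides the monomial with exponents mono."""
    return all(l <= m for l, m in zip(lt, mono))

def remainder_monomials(leading_terms, bound):
    """Monomials with all exponents < bound that no leading term divides.

    The bound only needs to exceed the pure powers x^2 and y^3 among the leading terms.
    """
    n = len(leading_terms[0])
    return [m for m in product(range(bound), repeat=n)
            if not any(divides(lt, m) for lt in leading_terms)]

B = remainder_monomials(leading_terms, bound=4)
print(B)        # exponent pairs of 1, y, y^2, x, xy
print(len(B))   # dimension of A over C
```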


2.1.2 Eigenvalues of multiplication maps 
For the remainder of this section, we will assume that

$$A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s\rangle$$

is finite-dimensional over $K$. For simplicity of exposition, we will also assume
that

$$K = \overline{K}.$$

Thus $K$ will always be algebraically closed.
As in the introduction, $f \in K[x_1,\dots,x_n]$ gives a multiplication map

$$M_f : A \longrightarrow A.$$

Our main result is the following Eigenvalue Theorem first noticed by Lazard
in 1981 (see [Laz81]).
Theorem 2.1.4. Assume that (2.1) has a finite positive number of solutions.
The eigenvalues of $M_f$ are the values of $f$ at the solutions of (2.1) over $K$.


Proof. We will sketch the proof and refer to Theorem 4.5 of Chapter 2 of
[CLO98] for the details.

First suppose $\lambda \in K$ is not a value of $f$ at a solution of (2.1). Then the
equations

$$f - \lambda = f_1 = \cdots = f_s = 0$$

have no solutions over $K = \overline{K}$, so that by the Nullstellensatz, we can write

$$1 = h\cdot(f-\lambda) + \sum_{i=1}^{s} h_if_i$$

for some polynomials $h,h_1,\dots,h_s \in K[x_1,\dots,x_n]$. Since the multiplication
map $M_1$ is the identity $1_A$ and each $M_{f_i}$ is the zero map, it follows that

$$M_f - \lambda 1_A = M_{f-\lambda} : A \longrightarrow A$$

is an isomorphism with inverse $M_h$. Thus $\lambda$ is not an eigenvalue of $M_f$.

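Going the other way, let $p \in K^n$ be a solution of (2.1). As in the proof of
Theorem 2.1.2, the remainder monomials $B = \{x^{\alpha(1)},\dots,x^{\alpha(m)}\}$ give the
basis $[x^{\alpha(1)}],\dots,[x^{\alpha(m)}]$ of $A$. The matrix of $M_f$ relative to this basis is denoted
$\mathbf{M}_f$. For $j = 1,\dots,m$, let $p^{\alpha(j)}$ be the element of $K$ obtained by evaluating
$x^{\alpha(j)}$ at $p$. Then we claim that

$$\mathbf{M}_f^t\,(p^{\alpha(1)},\dots,p^{\alpha(m)})^t = f(p)\,(p^{\alpha(1)},\dots,p^{\alpha(m)})^t, \qquad (2.3)$$

where $t$ denotes transpose. Since $1 \in B$ (otherwise there are no solutions),
the vector $(p^{\alpha(1)},\dots,p^{\alpha(m)})^t$ is nonzero. Thus (2.3) implies that $f(p)$ is an
eigenvalue of $\mathbf{M}_f^t$ and hence also of $\mathbf{M}_f$ and $M_f$.

To prove (2.3), suppose that $\mathbf{M}_f = (m_{ij})$. This means that

$$[x^{\alpha(j)}f] = \sum_{i=1}^{m} m_{ij}[x^{\alpha(i)}]$$

for $j = 1,\dots,m$. Then $x^{\alpha(j)}f \equiv \sum_{i=1}^{m} m_{ij}x^{\alpha(i)} \bmod \langle f_1,\dots,f_s\rangle$. Since
$f_1,\dots,f_s$ all vanish at $p$, evaluating this congruence at $p$ implies that

$$p^{\alpha(j)}f(p) = \sum_{i=1}^{m} m_{ij}\,p^{\alpha(i)}$$

for $j = 1,\dots,m$. This easily implies (2.3).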


Example 2.1.5. For the polynomials of Examples 2.1.1 and 2.1.3, the set $B =
\{1,y,y^2,x,xy\}$ gives a basis of $A$. One computes that the matrix of $M_x$ is

$$\mathbf{M}_x = \begin{pmatrix}
0&0&0&0&0\\
0&0&0&2&2\\
0&0&0&-2&-2\\
1&0&0&0&0\\
0&1&1&0&0
\end{pmatrix}.$$

The first and second columns are especially easy to see since here $M_x$ maps
basis elements to basis elements. For the third column, one uses $f_2 = xy^2 - xy$
to show that

$$M_x([y^2]) = [xy^2] = [xy].$$

The fourth and fifth columns are obtained similarly. (Exercise: Do this.) Using
Maple or Mathematica, one finds that the characteristic polynomial of $\mathbf{M}_x$ is
$\mathrm{CharPoly}_{\mathbf{M}_x}(u) = u^5$. By Theorem 2.1.4, it follows that all solutions of the
equations (2.2) have $x$-coordinate equal to 0.

In a similar way, one finds that $M_y$ has matrix

$$\mathbf{M}_y = \begin{pmatrix}
0&0&0&0&0\\
1&0&-1&0&0\\
0&1&2&0&0\\
0&0&0&0&0\\
0&0&0&1&1
\end{pmatrix}$$

with characteristic polynomial $\mathrm{CharPoly}_{\mathbf{M}_y}(u) = u^2(u-1)^3$. Thus the
$y$-coordinate of a solution of (2.2) is 0 or 1. For later purposes we also note
that

$$\mathbf{M}_x \text{ has minimal polynomial } \mathrm{MinPoly}_{\mathbf{M}_x}(u) = u^3,$$

$$\mathbf{M}_y \text{ has minimal polynomial } \mathrm{MinPoly}_{\mathbf{M}_y}(u) = u(u-1)^2.$$

We will see later that since $y$ takes distinct values 0 and 1 at the solutions,
the characteristic polynomial $u^2(u-1)^3$ of $\mathbf{M}_y$ tells us the multiplicities of the
solutions of (2.2).


In general, the matrix of $M_f : A \to A$ is easy to compute once we have a
Gröbner basis $G$ of $\langle f_1,\dots,f_s\rangle$. This is true because of the following:


• As we saw in the proofs of Theorems 2.1.2 and 2.1.4, $G$ determines the
remainder monomials $B$ that give a basis of $A$.
• Given $g \in K[x_1,\dots,x_n]$, the division algorithm from Chapter 2, Section 3
of [CLO97] constructs a normal form
$$N(g) \in \mathrm{Span}(B)$$
with the property that $g \equiv N(g) \bmod \langle f_1,\dots,f_s\rangle$.


This gives an easy algorithm for computing $\mathbf{M}_f$ with respect to the basis of
$A$ given by $B$: For each $x^\alpha \in B$, simply compute $N(x^\alpha f)$ using the division
algorithm. This is what we did in Example 2.1.5.
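The procedure just described can be sketched for the running example. This is a minimal illustration rather than a general Gröbner basis routine: the rewriting rules are hard-coded from the Gröbner basis of Example 2.1.3, polynomials are stored as dictionaries from exponent pairs to coefficients, and all function names are choices made for this sketch.

```python
from fractions import Fraction

# Rewriting rules from the Groebner basis of Example 2.1.3 (lex, x > y):
# leading monomial -> its replacement modulo the ideal, as {(deg_x, deg_y): coeff}.
RULES = {
    (2, 0): {(0, 1): 2, (0, 2): -2},   # x^2   -> 2y - 2y^2
    (1, 2): {(1, 1): 1},               # x*y^2 -> x*y
    (0, 3): {(0, 2): 2, (0, 1): -1},   # y^3   -> 2y^2 - y
}
BASIS = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)]   # 1, y, y^2, x, xy

def normal_form(poly):
    """Rewrite {monomial: coeff} until every monomial lies in BASIS."""
    poly = {m: Fraction(c) for m, c in poly.items() if c != 0}
    while True:
        target = next((m for m in poly if m not in BASIS), None)
        if target is None:
            return poly
        coeff = poly.pop(target)
        # Some leading term divides any monomial outside BASIS.
        lead = next(l for l in RULES if l[0] <= target[0] and l[1] <= target[1])
        shift = (target[0] - lead[0], target[1] - lead[1])
        for m, c in RULES[lead].items():
            key = (m[0] + shift[0], m[1] + shift[1])
            poly[key] = poly.get(key, 0) + coeff * c
            if poly[key] == 0:
                del poly[key]

def multiplication_matrix(f):
    """Matrix of M_f in BASIS; column j holds the coordinates of N(x^alpha(j) * f)."""
    cols = []
    for b in BASIS:
        prod = {}
        for m, c in f.items():
            key = (m[0] + b[0], m[1] + b[1])
            prod[key] = prod.get(key, 0) + c
        nf = normal_form(prod)
        cols.append([nf.get(m, Fraction(0)) for m in BASIS])
    size = len(BASIS)
    return [[cols[j][i] for j in range(size)] for i in range(size)]

M_x = multiplication_matrix({(1, 0): 1})
M_y = multiplication_matrix({(0, 1): 1})
for row in M_x:
    print([int(v) for v in row])
```

Because the rules come from a Gröbner basis, the result of the rewriting does not depend on the order in which monomials are reduced.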
If we compute the matrix $\mathbf{M}_{x_i}$ for each $i$, then Theorem 2.1.4 implies
that the $x_i$-coordinates of the solutions are given by the eigenvalues of $\mathbf{M}_{x_i}$. But
how do we put these coordinates together to figure out the actual solutions?
This was trivial to do in Example 2.1.5. In the general case, one could simply
try all possible combinations of the coordinates to find the solutions. This is
very inefficient. We will learn a better method in Section 2.1.3.


Minimal Polynomials of Multiplication Maps. The minimal polynomial
of a multiplication map $M_f : A \to A$ has an interesting interpretation. Given
$f \in K[x_1,\dots,x_n]$, note that

$$M_f \text{ is the zero map} \iff f \in \langle f_1,\dots,f_s\rangle.$$

Furthermore, given any polynomial $P(u) \in K[u]$, we have

$$P(M_f) = M_{P(f)}.$$

(Exercise: Prove these two facts.)

As defined in linear algebra, the minimal polynomial $P$ of $M_f$ is the monic
polynomial of minimal degree such that $P(M_f)$ is the zero map. Using the
above two facts, it follows that $P$ is the monic polynomial of minimal degree
such that $P(f) \in \langle f_1,\dots,f_s\rangle$.

In particular, the minimal polynomial of $M_{x_i}$ is the monic polynomial of
minimal degree such that $P(x_i) \in \langle f_1,\dots,f_s\rangle$. In other words, $P(x_i)$ is the
generator of the elimination ideal

$$\langle f_1,\dots,f_s\rangle \cap K[x_i].$$

(Exercise: Prove this.) Thus $P(x_i) = 0$ is the equation obtained by eliminating
all variables but $x_i$ from our original system of equations (2.1). This gives a
relation between multiplication maps and elimination theory.
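For the running example this can be checked directly on matrices. The sketch below assumes the multiplication matrices of $x$ and $y$ in the basis $B = \{1,y,y^2,x,xy\}$, recomputed here from the relations $x^2 \equiv 2y-2y^2$, $xy^2 \equiv xy$, $y^3 \equiv 2y^2-y$; it verifies that $x^3$ and $y(y-1)^2$ act as zero while $x^2$ and $y(y-1)$ do not, consistent with the elimination ideals being generated by $x^3$ and $y(y-1)^2$.

```python
# Multiplication matrices in the basis B = {1, y, y^2, x, xy} (assumed, see lead-in).
M_x = [[0, 0, 0, 0, 0],
       [0, 0, 0, 2, 2],
       [0, 0, 0, -2, -2],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 0]]
M_y = [[0, 0, 0, 0, 0],
       [1, 0, -1, 0, 0],
       [0, 1, 2, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 1, 1]]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def is_zero(m):
    return all(x == 0 for row in m for x in row)

I5 = [[int(i == j) for j in range(5)] for i in range(5)]
M_y_minus_1 = [[M_y[i][j] - I5[i][j] for j in range(5)] for i in range(5)]

x2 = matmul(M_x, M_x)                    # multiplication by x^2
x3 = matmul(x2, M_x)                     # multiplication by x^3
y_y1 = matmul(M_y, M_y_minus_1)          # multiplication by y(y-1)
y_y1_sq = matmul(y_y1, M_y_minus_1)      # multiplication by y(y-1)^2

print(is_zero(x2), is_zero(x3))
print(is_zero(y_y1), is_zero(y_y1_sq))
```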


2.1.3 Eigenvectors of multiplication maps 


A better method for solving equations, first described in [AS88], is to use the
eigenvectors of $\mathbf{M}_f^t$ given by (2.3), namely

$$\mathbf{M}_f^t\,(p^{\alpha(1)},\dots,p^{\alpha(m)})^t = f(p)\,(p^{\alpha(1)},\dots,p^{\alpha(m)})^t.$$

In this equation, $p$ is a solution of (2.1), $B = \{x^{\alpha(1)},\dots,x^{\alpha(m)}\}$, and $p^{\alpha(j)}$ is
the element of $K$ obtained by evaluating $x^{\alpha(j)}$ at $p$. As we noted in the proof
of Theorem 2.1.4, (2.3) implies that $(p^{\alpha(1)},\dots,p^{\alpha(m)})^t$ is an eigenvector of $\mathbf{M}_f^t$
for the eigenvalue $f(p)$.

This allows us to use eigenvalues to find solutions as follows. Suppose that
all eigenspaces of $\mathbf{M}_f^t$ have dimension 1 (we say that $\mathbf{M}_f^t$ is non-derogatory in
this case). Then suppose that $\lambda$ is an eigenvalue of $\mathbf{M}_f^t$ with eigenvector
$$v = (u_1,\dots,u_m)^t.$$

By assumption, we know that $v$ is unique up to a scalar. At this point, we
also know that $\lambda = f(p)$ for some solution $p$, but we don’t know what $p$ is.
To determine $p$, observe that $(p^{\alpha(1)},\dots,p^{\alpha(m)})^t$ is also an eigenvector of $\mathbf{M}_f^t$
for $\lambda$. Since we may assume that $x^{\alpha(1)} = 1$, the first coordinate of this
eigenvector is 1. Since $\lambda$ has a 1-dimensional eigenspace, our computed eigenvector
$v$ is a scalar multiple of $(p^{\alpha(1)},\dots,p^{\alpha(m)})^t$. Hence, if we rescale $v$ so that its
first coordinate is 1, then

$$v = (1,u_2,\dots,u_m)^t = (1,p^{\alpha(2)},\dots,p^{\alpha(m)})^t. \qquad (2.4)$$

The key point is that the monomials $x^{\alpha(j)} \in B$ include some (and often all) of
the variables $x_1,\dots,x_n$. This means that we can read off the corresponding
coordinates of $p$ from $v$. Here is an example of how this works.


Example 2.1.6. Consider the matrices $\mathbf{M}_x$ and $\mathbf{M}_y$ from Example 2.1.5. Neither
is non-derogatory since Maple or Mathematica shows that their eigenspaces
all have dimension 2. However, if we set $f = 2x + 3y$, then

$$\mathbf{M}_f^t = 2\mathbf{M}_x^t + 3\mathbf{M}_y^t$$

is non-derogatory, where

the eigenvalue 0 has eigenbasis $v = (1,0,0,0,0)^t$
the eigenvalue 3 has eigenbasis $v = (1,1,1,0,0)^t$.

(Exercise: Check this.) Since $B = \{1,y,y^2,x,xy\}$ has the variables $x$ and $y$
in the fourth and second positions respectively, it follows from (2.4) that the
$x$- and $y$-coordinates of the solutions are the fourth and second entries of the
eigenvectors. This gives the solutions

$$(0,0) \quad\text{and}\quad (0,1)$$

found in Example 2.1.1.
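The eigenvector check and the reading-off step can be sketched as follows. The matrices are the multiplication matrices of $x$ and $y$ in the basis $B = \{1,y,y^2,x,xy\}$, recomputed for this sketch from the Gröbner basis relations and therefore an assumption of the example; `read_solution` is a hypothetical helper name.

```python
from fractions import Fraction

# Multiplication matrices in the basis B = [1, y, y^2, x, xy] (assumed, see lead-in).
M_x = [[0, 0, 0, 0, 0],
       [0, 0, 0, 2, 2],
       [0, 0, 0, -2, -2],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 0]]
M_y = [[0, 0, 0, 0, 0],
       [1, 0, -1, 0, 0],
       [0, 1, 2, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 1, 1]]

# Transpose of M_f = 2*M_x + 3*M_y for f = 2x + 3y.
M_f_t = [[2 * M_x[j][i] + 3 * M_y[j][i] for j in range(5)] for i in range(5)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def read_solution(v):
    """Rescale so the entry for the monomial 1 is 1, then read (x, y) from positions 4 and 2."""
    scaled = [Fraction(c, v[0]) for c in v]
    return (scaled[3], scaled[1])

v0 = [1, 0, 0, 0, 0]      # claimed eigenvector for the eigenvalue 0
v3 = [2, 2, 2, 0, 0]      # a scalar multiple of the claimed eigenvector for 3

print(apply(M_f_t, v0))   # should be 0 * v0
print(apply(M_f_t, v3))   # should be 3 * v3
solutions = [read_solution(v0), read_solution(v3)]
print(solutions)
```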


Before using this method in general, we need to answer some questions: 


• What happens when some variables are missing from $B$?
• What does it mean for a matrix to be non-derogatory?
• Can we find $f \in K[x_1,\dots,x_n]$ such that $\mathbf{M}_f$ is non-derogatory? What
happens if we can’t?


The remainder of Section 2.1.3 will be devoted to answering these questions. 


Missing Variables. For a fixed monomial ordering, the ideal $\langle f_1,\dots,f_s\rangle$ has
a Gröbner basis $G$. We assume that $G$ is reduced, which means the following:


• The leading coefficient of every $g \in G$ is 1.
• For any $g \in G$, its non-leading terms are not divisible by the leading terms
of the remaining polynomials in $G$.


As we’ve already explained, $G$ then determines the remainder monomials

$$B = \{x^\alpha \mid x^\alpha \text{ is not divisible by the leading term of any } g \in G\}.$$

We will assume that $G \neq \{1\}$, which implies that $1 \in B$ and that (2.1) has
solutions in $K$ (the latter is true by the Consistency Algorithm described in
Chapter 4, §1 of [CLO97] since $K = \overline{K}$).

We need to understand which variables lie in $B$. We say that $x_i$ is known if
$x_i \in B$ and missing otherwise (this is not standard terminology). As explained
above, if $\mathbf{M}_f$ is non-derogatory, then the eigenvectors determine the known
coordinates of all solutions. It remains to find the missing coordinates. We
will analyze this using the arguments of [MS95].

A variable $x_i$ is missing if it is divisible by the leading term of some element
of $G$. Since $G \neq \{1\}$ and is reduced, it follows that there is some $g_i \in G$ such
that

$$g_i = x_i + \text{terms strictly smaller according to the term order}.$$

Furthermore, since this is true for every missing variable and $G$ is reduced,
it follows that the other terms in the above formula for $g_i$ involve only known
variables (if a missing variable appeared in some term, it would be a missing
variable $x_j \neq x_i$, so that the term would be divisible by the leading term of
$g_j = x_j + \cdots \in G$). Thus

$$g_i = x_i + \text{terms involving only known variables}.$$

Now let $p$ be a solution of (2.1). Then $g_i(p) = 0$, so that the above analysis
implies that

$$0 = p_i + \text{terms involving only known coordinates}.$$

Hence the $g_i \in G$ tell us how to find the missing coordinates in terms of the
known ones.
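A toy illustration may help here. The system below is hypothetical (it does not appear in the text) and is chosen only because its reduced Gröbner basis can be found by hand.

```latex
\begin{aligned}
&\text{System: } x - y^2 = 0,\quad y^2 - y = 0, \qquad \text{lex order with } x > y.\\
&\text{Reduced Gr\"obner basis: } G = \{\,x - y,\; y^2 - y\,\}
   \quad\text{since } (x - y^2) + (y^2 - y) = x - y.\\
&\text{Leading terms: } x,\; y^2 \;\Longrightarrow\; B = \{1, y\},
   \text{ so } y \text{ is known and } x \text{ is missing.}\\
&\text{From } g = x - y \in G:\quad 0 = p_x - p_y,
   \text{ i.e. } p_x = p_y.\\
&\text{Known coordinates } p_y \in \{0, 1\}
   \;\Longrightarrow\; \text{solutions } (0,0) \text{ and } (1,1).
\end{aligned}
```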


Non-Derogatory Matrices. If $M$ is a square matrix, then:

$$\begin{aligned}
M \text{ is non-derogatory} &\iff \text{the Jordan canonical form of } M \text{ has one Jordan block per eigenvalue}\\
&\iff \mathrm{MinPoly}_M = \mathrm{CharPoly}_M.
\end{aligned}$$

(Exercise: Prove this.) This will have a nice consequence in Section 2.1.4
below. Note also that $M$ is non-derogatory if and only if $M^t$ is.


Existence of Non-Derogatory Multiplication Matrices. Our first
observation is that there are systems of equations such that $\mathbf{M}_f$ is derogatory for
all polynomials $f \in K[x_1,\dots,x_n]$. Here is a simple example.

Example 2.1.7. Consider the equations

$$x^2 = y^2 = 0.$$

The only solution is $p = (0,0)$ and $B = \{1,x,y,xy\}$. Given $f = a + bx + cy +
dx^2 + exy + \cdots$ in $K[x,y]$, we have $f(p) = a$. One computes that

$$\mathbf{M}_f = \begin{pmatrix}
a&0&0&0\\
b&a&0&0\\
c&0&a&0\\
e&c&b&a
\end{pmatrix},$$

so that $\mathbf{M}_f - aI_4$ has rank $\leq 2$ for all $f$. (Exercise: Prove these assertions.) It
follows that the eigenspace of $\mathbf{M}_f$ for the eigenvalue $f(p) = a$ has dimension
at least 2. Thus $\mathbf{M}_f$ is always derogatory.
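The rank bound in this example is easy to test numerically. The sketch below assumes the matrix of $M_f$ in the basis $B = \{1,x,y,xy\}$ obtained from the relations $x^2 \equiv 0$ and $y^2 \equiv 0$ (columns are the normal forms of $f$, $xf$, $yf$, $xyf$); the helper names are illustrative.

```python
from fractions import Fraction

def M_f(a, b, c, e):
    """Matrix of M_f in the basis B = {1, x, y, xy} for f = a + b*x + c*y + e*x*y + ..."""
    return [[a, 0, 0, 0],
            [b, a, 0, 0],
            [c, 0, a, 0],
            [e, c, b, a]]

def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def rank_of_shift(a, b, c, e):
    """Rank of M_f - a*I, whose kernel is the eigenspace for the eigenvalue f(p) = a."""
    m = M_f(a, b, c, e)
    return rank([[m[i][j] - (a if i == j else 0) for j in range(4)] for i in range(4)])

samples = [(5, 1, 2, 3), (7, 0, 0, 1), (1, 0, 0, 0), (-2, 4, -6, 9)]
print([rank_of_shift(*s) for s in samples])
```

A rank of at most 2 for a $4\times 4$ matrix means an eigenspace of dimension at least 2, which is the derogatory behaviour claimed in the example.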


To describe what happens in general, we need to discuss the local structure
of solutions. A basic result from commutative algebra states that the ideal
$\langle f_1,\dots,f_s\rangle$ has a primary decomposition. Since we are working over an algebraically
closed field and the equations (2.1) have only finitely many solutions, the primary
decomposition can be written

$$\langle f_1,\dots,f_s\rangle = \bigcap_p I_p, \qquad (2.5)$$

where the intersection is over all solutions $p$ of (2.1) and each $I_p$ is defined by

$$I_p = \{f \in K[x_1,\dots,x_n] \mid gf \in \langle f_1,\dots,f_s\rangle \text{ for some } g \text{ with } g(p) \neq 0\}. \qquad (2.6)$$

One can show that $I_p$ is a primary ideal, which in this case means that $\sqrt{I_p}$
is the maximal ideal $\langle x_1-p_1,\dots,x_n-p_n\rangle$, $p = (p_1,\dots,p_n)$. We will explain
how to compute primary decompositions in Section 2.4.3. See also Chapter 5.


Example 2.1.8. The ideal of Example 2.1.1 has the primary decomposition

$$\begin{aligned}
\langle x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y\rangle
&= \langle x^2, y\rangle \cap \langle x^2 + 2(y-1),\; x(y-1),\; (y-1)^2\rangle\\
&= I_{(0,0)} \cap I_{(0,1)}.
\end{aligned}$$

We will prove this in Section 2.4.3.

Given the primary decomposition $\langle f_1,\dots,f_s\rangle = \bigcap_p I_p$, we set

$$A_p = K[x_1,\dots,x_n]/I_p.$$

Then (2.5) and the Chinese Remainder Theorem (see p. 245 of [KR00]) give
an algebra isomorphism

$$A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s\rangle = K[x_1,\dots,x_n]\Big/\bigcap_p I_p \;\simeq\; \prod_p K[x_1,\dots,x_n]/I_p = \prod_p A_p. \qquad (2.7)$$

We call $A_p$ the local ring of the solution $p$. This ring reflects the local structure
of a solution $p$ of (2.1). The multiplicity of a solution $p$ is defined to be

$$\mathrm{mult}(p) = \dim_K A_p.$$

This definition and (2.7) imply that

$$\dim_K A = \#\,\text{solutions counted with multiplicity}.$$

We now define a special kind of solution.


Definition 2.1.9. A solution $p$ of (2.1) is curvilinear if $A_p \simeq K[x]/\langle x^k\rangle$
for some integer $k \geq 1$.


Over the complex numbers, $p$ is curvilinear if and only if we can find
local analytic coordinates $u_1,\dots,u_n$ at $p$ and an integer $k \geq 1$ such that the
equations are equivalent to

$$u_1 = u_2 = \cdots = u_{n-1} = u_n^k = 0.$$

Alternatively, let $\mathfrak{m}_p$ be the maximal ideal of $A_p$. The integer

$$e_p = \dim_K \mathfrak{m}_p/\mathfrak{m}_p^2 = \#\,\text{minimal generators of } \mathfrak{m}_p \qquad (2.8)$$

is called the embedding dimension of $A_p$. Then one can prove that a solution
$p$ is curvilinear if and only if $A_p$ has embedding dimension $e_p \leq 1$.


Example 2.1.10. For the solutions $(0,0)$ and $(0,1)$ of the polynomials given
in Example 2.1.8, we compute their multiplicity and embedding dimension as
follows. For $(0,0)$, we have

$$A_{(0,0)} = K[x,y]/\langle x^2, y\rangle \simeq K[x]/\langle x^2\rangle.$$

This shows that $(0,0)$ has multiplicity 2 and embedding dimension 1 (and
hence is curvilinear). As for $(0,1)$, a Gröbner basis computation shows that

$$A_{(0,1)} = K[x,y]/\langle x^2 + 2(y-1),\; x(y-1),\; (y-1)^2\rangle$$

has dimension 3, so that the multiplicity is 3. (Exercise: Do this.) The
embedding dimension is less obvious and will be computed in Section 2.2.2.


Using curvilinear solutions, we can characterize those systems for which
$\mathbf{M}_f$ (or $\mathbf{M}_f^t$ or $M_f$) is non-derogatory for some $f \in K[x_1,\dots,x_n]$.


Theorem 2.1.11. There exists $f \in K[x_1,\dots,x_n]$ such that $M_f$ is
non-derogatory if and only if every solution of (2.1) is curvilinear. Furthermore,
when the solutions are all curvilinear, then $M_f$ is non-derogatory when $f$ is a
generic linear combination of $x_1,\dots,x_n$.
Proof. First observe that since there are only finitely many solutions and $K$ is
infinite (being algebraically closed), a generic choice of $a_1,\dots,a_n$ guarantees
that $f = a_1x_1 + \cdots + a_nx_n$ takes distinct values at the solutions $p$.

Next observe that $M_f$ is compatible with the algebra isomorphism (2.7).
If we also assume that $f$ takes distinct values at the solutions, then it follows
that $M_f$ is non-derogatory if and only if

$$M_f : A_p \longrightarrow A_p$$

is non-derogatory for every $p$.

To prove the theorem, first suppose that $M_f : A_p \to A_p$ is non-derogatory
and let

$$u = [f - f(p)] \in A_p$$

be the element of $A_p$ determined by $f - f(p)$. Then the kernel of $M_{f-f(p)}$ has
dimension 1, which implies the following:


• $u$ lies in the maximal ideal $\mathfrak{m}_p$ since elements in $A_p \setminus \mathfrak{m}_p$ are invertible.
• The image of $M_{f-f(p)}$ has codimension 1.


Since the image is $\langle u\rangle \subset \mathfrak{m}_p$ and $\mathfrak{m}_p$ also has codimension 1 in $A_p$, it follows
that

$$\langle u\rangle = \mathfrak{m}_p.$$

Thus $p$ is curvilinear. (Exercise: Supply the details.)

Conversely, if every $p$ is curvilinear, then it is easy to see that $M_f$ is
non-derogatory when $f$ is a generic linear combination of the variables. (Exercise:
Complete the proof.)


Applying the Eigenvector Method. In order to apply the method described
here, we need to find f such that M_f is non-derogatory. Typically, one
uses f = a_1 x_1 + ··· + a_n x_n. The idea is that when the solutions are curvilinear,
Theorem 2.1.11 implies that f will work for most choices of the a_i.

We implement this as follows. Given a system with finitely many solutions,
make a random choice of the a_i and compute the corresponding M_f. Then:


• Test if M_f is non-derogatory by computing whether CharPoly_{M_f}(u) equals
MinPoly_{M_f}(u).

• Once a non-derogatory M_f is found, use the eigenvector method to find the
solutions.


If our system has only curvilinear solutions, then this procedure will probably
succeed after a small number of attempts. On the other hand, if we make a
large number of choices of the a_i, all of which give a derogatory M_f, then we
are either very unlucky or our system has some non-curvilinear solutions, but
we don’t know which.
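The test in the first bullet can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not an implementation taken from the text: instead of computing CharPoly and MinPoly separately, it uses the equivalent criterion that an n × n matrix M is non-derogatory exactly when I, M, ..., M^(n-1) are linearly independent, so that the minimal polynomial has degree n. The helper names are hypothetical, and the two sample matrices are multiplication by x and by y on K[x,y]/⟨x^2 + 2y, xy, y^2⟩ in the basis 1, x, y (the local algebra that reappears in Example 2.2.15).

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(rows):
    """Rank over the rationals, by Gauss-Jordan elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                factor = rows[r][c] / rows[rk][c]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def is_non_derogatory(m):
    """True when I, M, ..., M^(n-1) are independent, i.e. deg MinPoly = deg CharPoly."""
    n = len(m)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    flattened = []
    for _ in range(n):
        flattened.append([x for row in power for x in row])
        power = mat_mul(power, m)
    return rank(flattened) == n

# Columns are the images of the basis elements 1, x, y (an assumed sample algebra).
M_X = [[0, 0, 0], [1, 0, 0], [0, -2, 0]]   # 1 -> x, x -> x^2 = -2y, y -> xy = 0
M_Y = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]    # 1 -> y, x -> 0,        y -> 0

print(is_non_derogatory(M_X), is_non_derogatory(M_Y))
```

Here multiplication by x passes the test while multiplication by y does not, which is what one expects for a curvilinear point: some linear forms work, but not all of them.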

To overcome this problem, there are two ways to proceed:

• First, one can compute the radical


\sqrt{\langle f_1,\dots,f_s \rangle} = \{ f \in K[x_1,\dots,x_n] \mid f^k \in \langle f_1,\dots,f_s \rangle \text{ for some } k \ge 1 \}.


The radical gives a system of equations with the same solutions as (2.1),
except that all solutions now have multiplicity 1 and hence are curvilinear.
Thus Theorem 2.1.11 applies to the radical system. Furthermore,
Proposition 2.7 of Chapter 2 of [CLO98] states that


\sqrt{\langle f_1,\dots,f_s \rangle} = \langle f_1,\dots,f_s,\ (p_1(x_1))_{red},\dots,(p_n(x_n))_{red} \rangle,


where p_i(x_i) is the minimal polynomial of M_{x_i} written as a polynomial in x_i
and (p_i(x_i))_{red} is the squarefree polynomial with the same roots as p_i(x_i).
See [KR00, Sec. 3.7B] for the squarefree part of a polynomial.

• Second, one can intersect eigenspaces of the transposes M_{x_i}^T. Let p_1 be an
eigenvalue of M_{x_1}^T, so that p_1 is the first coordinate of a solution of (2.1).
Then, since M_{x_1}^T and M_{x_2}^T commute, M_{x_2}^T induces a linear map


M_{x_2}^T : E_A(p_1, M_{x_1}^T) \to E_A(p_1, M_{x_1}^T),


where E_A(p_1, M_{x_1}^T) is the eigenspace of M_{x_1}^T for the eigenvalue p_1. The
eigenvalues of this map give the second coordinates of all solutions which have
p_1 as their first coordinate. Continuing in this way gives all solutions since
the intersection ∩_{i=1}^n E_A(p_i, M_{x_i}^T) is one-dimensional (see Theorem 3.2.3 of
Chapter 3). This method is analyzed carefully in [MT01].


2.1.4 Single-Variable representation 


One nice property of the non-derogatory case is that when M_f is
non-derogatory, we can represent the algebra A using one variable. Here is the
precise result.


Proposition 2.1.12. Assume that f ∈ K[x_1,...,x_n] and that M_f is
non-derogatory. Then there is an algebra isomorphism


K[u]/\langle \mathrm{CharPoly}_{M_f}(u) \rangle \simeq A.


Proof. Consider the map K[u] → A defined by P(u) ↦ [P(f)]. Then P(u) is
in the kernel if and only if [P(f)] = [0], i.e., if and only if P(f) ∈ ⟨f_1,...,f_s⟩.
In the discussion of minimal polynomials in Section 2.1.2, we showed that the
minimal polynomial of M_f is the nonzero polynomial of smallest degree with
this property. It follows easily that the kernel of this map is generated by
the minimal polynomial of M_f. Thus we get an injective algebra homomorphism

K[u]/\langle \mathrm{MinPoly}_{M_f}(u) \rangle \to A.


But MinPoly_{M_f}(u) = CharPoly_{M_f}(u) since M_f is non-derogatory, and we also
know that

\dim_K K[u]/\langle \mathrm{CharPoly}_{M_f}(u) \rangle = \deg \mathrm{CharPoly}_{M_f}(u) = \dim_K A.

It follows that the above injection is the desired isomorphism.


Notice that this proof applies over an arbitrary field K. Thus, when K is 
infinite and all of the solutions are curvilinear (e.g., all have multiplicity 1), 
then Proposition 2.1.12 applies when f is a generic linear combination of the 
variables. 

We can use the single-variable representation to give an alternate method
for finding solutions. The idea is that the isomorphism


K[u]/\langle \mathrm{CharPoly}_{M_f}(u) \rangle \simeq A

enables us to express the coset [x_i] ∈ A as a polynomial in [f], say


[x_i] = P_i([f]),


where deg(P_i) < dim_K A. (Exercise: Show that P_i can be explicitly computed
using a Gröbner basis for ⟨f_1,...,f_s⟩.) Now we get all solutions as follows.


Proposition 2.1.13. Assume that M_f is non-derogatory and let P_1,...,P_n
be constructed as above. Then:


1. ⟨f_1,...,f_s⟩ = ⟨CharPoly_{M_f}(f), x_1 − P_1(f), ..., x_n − P_n(f)⟩.
2. For any root λ of CharPoly_{M_f}(u), the n-tuple


(P_1(\lambda),\dots,P_n(\lambda))


is a solution of (2.1), and all solutions of (2.1) arise this way.
3. If f = Σ_{i=1}^n a_i x_i, then Σ_{i=1}^n a_i (x_i − P_i(f)) = 0, so that the ideal
⟨f_1,...,f_s⟩ has n generators.


Proof. For part 1, it is easy to see that x_i − P_i(f) ∈ ⟨f_1,...,f_s⟩, and
CharPoly_{M_f}(f) ∈ ⟨f_1,...,f_s⟩ by the Cayley-Hamilton theorem. For the other
inclusion, one uses x_i = (x_i − P_i(f)) + P_i(f) to express an element of ⟨f_1,...,f_s⟩
as an element of ⟨x_1 − P_1(f), ..., x_n − P_n(f)⟩ plus a polynomial in f. Then
Proposition 2.1.12 implies that this polynomial is divisible by CharPoly_{M_f}(f).
From here, the other parts of the proposition follow easily. (Exercise: Complete
the proof.)


The third part of Proposition 2.1.13 implies that when all solutions are 
curvilinear, we can rewrite the system as n equations in n unknowns. In this 
case, the corresponding ideal is called a complete intersection. 
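As a small illustration of how the polynomials P_i can be found by linear algebra alone, here is a hedged sketch. It assumes that the matrix of M_f is already known in a basis whose first element is [1]; then the coordinate vectors of [1], [f], ..., [f^(m-1)] form a basis of A, and the coefficients of P_i are the solution of one linear system. The function names are hypothetical, and the sample data is multiplication by f = x on K[x,y]/⟨x^2 + 2y, xy, y^2⟩ in the basis 1, x, y (the ideal of Example 2.2.15), chosen only because it is small.

```python
from fractions import Fraction

def solve(columns, target):
    """Solve sum_k c[k] * columns[k] = target (square, invertible case)."""
    n = len(target)
    aug = [[Fraction(columns[k][i]) for k in range(n)] + [Fraction(target[i])]
           for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] for i in range(n)]

def apply(m, v):
    """Matrix (list of rows) times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def powers_of_f(m_f):
    """Coordinates of [1], [f], ..., [f^(m-1)], assuming [1] is the first basis element."""
    n = len(m_f)
    v = [Fraction(1)] + [Fraction(0)] * (n - 1)
    out = []
    for _ in range(n):
        out.append(v)
        v = apply(m_f, v)
    return out

# Multiplication by x in the basis 1, x, y: 1 -> x, x -> x^2 = -2y, y -> xy = 0.
M_X = [[0, 0, 0], [1, 0, 0], [0, -2, 0]]
KRYLOV = powers_of_f(M_X)

# Coefficients of P_i(u), lowest degree first, with [x_i] = P_i([f]).
P_X = solve(KRYLOV, [0, 1, 0])   # coordinates of [x]
P_Y = solve(KRYLOV, [0, 0, 1])   # coordinates of [y]
print(P_X, P_Y)
```

In this example P_x(u) = u and P_y(u) = −u^2/2, and M_x is nilpotent, so CharPoly_{M_x}(u) = u^3. Part 1 of Proposition 2.1.13 then describes the ideal as ⟨x^3, y + x^2/2⟩, which agrees with the lex Gröbner basis obtained in Example 2.2.19.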

The single-variable representation will have some unexpected consequences
later in the chapter.


2.1.5 Generalized eigenspaces and multiplicities


The final observation of Section 2.1 relates multiplicities and generalized
eigenspaces. Given an eigenvalue λ of a linear map T : V → V, recall that its
generalized eigenspace is


G_V(\lambda, T) = \{ v \in V \mid (T - \lambda I)^N(v) = 0 \text{ for some } N \ge 1 \}.


It is well-known that the dimension of G_V(λ, T) is the multiplicity of λ as a
root of the characteristic polynomial of T.


Proposition 2.1.14. The characteristic polynomial of M_f : A → A is


\mathrm{CharPoly}_{M_f}(u) = \prod_p (u - f(p))^{\mathrm{mult}(p)}.


Furthermore, if f ∈ K[x_1,...,x_n] takes distinct values at the solutions of (2.1)
and p is one of the solutions, then the generalized eigenspace G_A(f(p), M_f) is
naturally isomorphic to the local ring A_p.


Proof. First observe that M_f is compatible with the isomorphism

A \simeq \prod_p A_p = \prod_p K[x_1,\dots,x_n]/I_p,


where ⟨f_1,...,f_s⟩ = ∩_p I_p is the primary decomposition and the product and
intersection are over all solutions of (2.1).

Now fix one solution p and note that f(p) is the only eigenvalue of M_f
on A_p since A_p = K[x_1,...,x_n]/I_p and p is the only common solution of the
polynomials in I_p. This easily leads to the desired formula for CharPoly_{M_f}(u)
since mult(p) = dim_K A_p.

The previous paragraph also implies that M_{f−f(p)} is nilpotent on A_p, so
that A_p is contained in the generalized eigenspace of M_f for the eigenvalue
f(p). If we further assume that f(q) ≠ f(p) for q ≠ p, then M_{f−f(p)} is
invertible on A_q. Since A ≃ ∏_p A_p, it follows that we can identify A_p with
the generalized eigenspace G_A(f(p), M_f).


Here is a familiar example. 


Example 2.1.15. For the equations f_1 = f_2 = f_3 = 0 of Example 2.1.5, the
solutions are (0,0) and (0,1), and CharPoly_{M_y}(u) = u^2(u − 1)^3. Since y
separates the solutions, Proposition 2.1.14 implies that (0,0) has multiplicity 2
and (0,1) has multiplicity 3.


Genericity. Proposition 2.1.14 shows that we can compute multiplicities by
factoring CharPoly_{M_f}(u), provided f takes distinct values at the solutions.
Furthermore, if we let f = a_1 x_1 + ··· + a_n x_n, then this is true generically.
Thus, if we make a random choice of the a_i, then with high probability, the
resulting f will take distinct values at the solutions. Thus, given a system of
equations (2.1), we have a probabilistic algorithm for finding both the number
of solutions and their respective multiplicities.

But sometimes (in a deterministic algorithm, for instance), one needs a 
certificate that f takes distinct values at the solutions. Here are two ways to 
achieve this: 


• First, if ⟨f_1,...,f_s⟩ is radical, then f takes distinct values at the solutions
if and only if Φ = CharPoly_{M_f} has distinct roots. (Exercise: Prove this.)
Hence one need only compute gcd(Φ, Φ′).

• In general, one can compute √⟨f_1,...,f_s⟩ as described in Section 2.1.3
and then proceed as in the previous bullet.
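The gcd test in the first bullet, and the squarefree part used in Section 2.1.3, can both be sketched with the Euclidean algorithm over the rationals. The following is a minimal sketch with hypothetical helper names, assuming characteristic 0 and polynomials stored as coefficient lists with the lowest degree first. As sample input it uses Φ = u^2(u − 1)^3 from Example 2.1.15, which of course does not have distinct roots.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (the zero polynomial becomes [])."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_divmod(a, b):
    """Quotient and remainder of a by a nonzero polynomial b."""
    a = trim([Fraction(x) for x in a])
    b = trim([Fraction(x) for x in b])
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        factor = a[-1] / b[-1]
        q[shift] = factor
        a = trim([c - factor * b[i - shift] if i >= shift else c
                  for i, c in enumerate(a)])
    return trim(q), a

def poly_gcd(a, b):
    """Monic gcd by the Euclidean algorithm (a is assumed nonzero)."""
    a = trim([Fraction(x) for x in a])
    b = trim([Fraction(x) for x in b])
    while b:
        a, b = b, poly_divmod(a, b)[1]
    return [c / a[-1] for c in a]

def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]

def has_distinct_roots(p):
    """In characteristic 0, p has distinct roots iff gcd(p, p') is constant."""
    return len(poly_gcd(p, derivative(p))) == 1

PHI = [0, 0, -1, 3, -3, 1]                 # u^2 (u - 1)^3 = u^5 - 3u^4 + 3u^3 - u^2
COMMON = poly_gcd(PHI, derivative(PHI))    # repeated part of PHI
SQUAREFREE = poly_divmod(PHI, COMMON)[0]   # polynomial with the same roots, all simple
print(has_distinct_roots(PHI), COMMON, SQUAREFREE)
```

For this Φ the gcd is u(u − 1)^2 and the squarefree part is u(u − 1), so the certificate fails, as it must for a non-radical ideal.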


Numerical Issues. A serious numerical issue is that it is sometimes hard to
distinguish between a single solution of multiplicity k > 1 and a cluster of k
very close solutions of multiplicity 1. Several people, including Hans Stetter,
are trying to come up with numerically stable methods for understanding such
clusters. For example:

• While the individual points in a cluster are not stable, their center of
gravity is.

• When the cluster consists of two points, the slope of the line connecting
them is numerically stable.


More details can be found in [HS97a]. From a sophisticated point of view, this
is equivalent to studying the numerical stability of points in a Hilbert scheme of
subschemes of affine space of fixed finite length supported at a fixed point.


Other Notions of Multiplicity. The multiplicity mult(p) defined in this 
section is sometimes called the geometric multiplicity. There is also a more 
subtle version of multiplicity called the algebraic multiplicity or Hilbert-Samuel 
multiplicity e(p). A discussion of multiplicity can be found in [Cox]. 


2.2 Ideals defined by linear conditions 


As in the previous section, we will assume that


A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s \rangle


is a finite-dimensional algebra over an algebraically closed field K. In this
section, we will continue the theme of linear algebra, focusing now on the role
of linear functionals on A and K[x_1,...,x_n].


2.2.1 Duals and dualizing modules 


Given A as above, its dual space is

\hat{A} = \mathrm{Hom}_K(A, K).

If {ℓ_1,...,ℓ_m} is a basis of Â, then composing with the quotient map
K[x_1,...,x_n] → A gives linear functionals

L_1,\dots,L_m : K[x_1,\dots,x_n] \to K

with the property that


\langle f_1,\dots,f_s \rangle = \{ f \in K[x_1,\dots,x_n] \mid L_i(f) = 0,\ i = 1,\dots,m \}.


Thus the ideal ⟨f_1,...,f_s⟩ is defined by the linear conditions given by the L_i.
In this section, we will explore some interesting ways of doing this.
Recall from Section 2.1 that we have the product decomposition


A \simeq \prod_p A_p


induced by the primary decomposition ⟨f_1,...,f_s⟩ = ∩_p I_p, where the product
and intersection are over all solutions in K^n of the equations


f_1 = f_2 = \cdots = f_s = 0.

The product induces a natural isomorphism

\hat{A} \simeq \prod_p \hat{A}_p.    (2.9)


One feature of (2.9) is the following. For each solution p, let
{ℓ_{p,i}}, i = 1,...,mult(p), be a basis of Â_p. As above, every ℓ_{p,i} gives a
linear functional

L_{p,i} : K[x_1,\dots,x_n] \to K.

Then:
• If we fix p, then


I_p = \{ f \in K[x_1,\dots,x_n] \mid L_{p,i}(f) = 0 \text{ for } i = 1,\dots,\mathrm{mult}(p) \}


is the primary ideal such that A_p = K[x_1,...,x_n]/I_p.
• If we vary over all p and i, then the L_{p,i} define the ideal


\langle f_1,\dots,f_s \rangle = \bigcap_p I_p.


This way of thinking of the linear conditions gives not only the ideal but also
its primary decomposition.

The dual space Â also relates to the matrices M_f and M_f^T appearing in
Section 2.1. Since M_f is the matrix of the multiplication map M_f : A → A
relative to the basis of A coming from the remainder monomials B, linear
algebra implies that M_f^T is the matrix of the dual map M̂_f : Â → Â relative to
the dual basis of Â. This has the following nice application.


Example 2.2.1. Let p be a solution and let 1_p : A → K be the linear functional
defined by 1_p([f]) = f(p). (Exercise: Explain why this is well-defined.) Thus
1_p ∈ Â. In Section 2.1.2, we used equation (2.3) to prove that f(p) is an
eigenvalue of M_f^T. In terms of M̂_f, this equation can be written


\hat{M}_f(1_p) = f(p)\, 1_p.

(Exercise: Prove this.) See Theorem 3.2.3 of Chapter 3 for more details.


We also need to explain how Â relates to commutative algebra. The key
point is that Â = Hom_K(A, K) becomes an A-module via


(a\ell)(b) = \ell(ab)


for a, b ∈ A and ℓ ∈ Â. Similarly, in the decomposition (2.9), each Â_p is
a module over A_p. We call Â and Â_p the dualizing modules of A and A_p
respectively. Dualizing modules are discussed in Chapter 21 of [Eis95].
Finally, we say that A is Gorenstein if there is an A-module isomorphism


A \simeq \hat{A}.


Being Gorenstein is equivalent to the existence of a nondegenerate bilinear
form


\langle\ ,\ \rangle : A \times A \to K


with the property that

\langle ab, c \rangle = \langle a, bc \rangle


for a, b, c ∈ A. (Exercise: Prove this.) Here is an example.


Example 2.2.2. Let P ∈ C[z] be a polynomial of degree d with simple zeros.
Consider the global residue


\mathrm{res}_P : C[z]/\langle P \rangle \to C

introduced in Chapter 1. By Theorem th:dual of Chapter 1, the bilinear form

C[z]/\langle P \rangle \times C[z]/\langle P \rangle \to C

defined by ⟨g_1, g_2⟩ = res_P(g_1 g_2) is nondegenerate. Since

\langle g_1 g_2, g_3 \rangle = \mathrm{res}_P((g_1 g_2) g_3) = \mathrm{res}_P(g_1 (g_2 g_3)) = \langle g_1, g_2 g_3 \rangle,

we see that A = C[z]/⟨P⟩ is Gorenstein.


See Chapter 21 of [Eis95] for more on Gorenstein duality.


2.2.2 Differential conditions defining ideals


So far, we’ve seen that each primary ideal I_p can be described using mult(p)
linear conditions. We will now explain how to represent these linear conditions
using constant coefficient differential operators evaluated at p. We will assume
that K has characteristic 0.

Let’s begin with some examples. 


Example 2.2.3. The equation

x^2 (x - 1)^3 = 0


has the solutions 0 of multiplicity 2 and 1 of multiplicity 3. In terms of
derivatives, we have the ideal


\langle x^2 (x - 1)^3 \rangle = \{ f \in K[x] \mid f(0) = f'(0) = 0,\ f(1) = f'(1) = f''(1) = 0 \}.


Notice that the multiplicities correspond to the number of conditions defining 
the ideal at 0 and 1 respectively. 
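The five conditions of Example 2.2.3 are easy to check by machine. The sketch below is only an illustration with hypothetical helper names: it stores a polynomial in one variable as a coefficient list (lowest degree first), evaluates the required derivatives at 0 and 1, and compares the outcome for the generator, for a multiple of it, and for two polynomials that miss one condition each.

```python
def poly_mul(p, q):
    """Product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def product(*factors):
    out = [1]
    for f in factors:
        out = poly_mul(out, f)
    return out

def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]

def evaluate(p, x):
    """Horner evaluation."""
    value = 0
    for c in reversed(p):
        value = value * x + c
    return value

# (point, derivative order): f(0), f'(0), f(1), f'(1), f''(1) must all vanish.
CONDITIONS = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]

def satisfies_conditions(p):
    for point, order in CONDITIONS:
        q = p
        for _ in range(order):
            q = derivative(q)
        if evaluate(q, point) != 0:
            return False
    return True

X, XM1 = [0, 1], [-1, 1]                       # x and x - 1
GENERATOR = product(X, X, XM1, XM1, XM1)       # x^2 (x - 1)^3
MULTIPLE = product(GENERATOR, [5, 1])          # (x + 5) x^2 (x - 1)^3
TOO_SMALL_AT_0 = product(X, XM1, XM1, XM1)     # x (x - 1)^3, fails f'(0) = 0
TOO_SMALL_AT_1 = product(X, X, XM1, XM1)       # x^2 (x - 1)^2, fails f''(1) = 0

print([satisfies_conditions(p)
       for p in (GENERATOR, MULTIPLE, TOO_SMALL_AT_0, TOO_SMALL_AT_1)])
```

The generator and its multiple satisfy all five conditions, while lowering either exponent breaks exactly the last condition at the corresponding point.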


Example 2.2.4. Consider the three sets of equations


(a) : x^2 = xy = y^2 = 0
(b) : x^2 = y^2 = 0
(c) : x = y^3 = 0.


One easily sees that (0,0) is the only solution, with multiplicity 3 in case (a),
multiplicity 4 in case (b), and multiplicity 3 in case (c). In terms of partial
derivatives, the corresponding ideals are given by


(a) : \langle x^2, xy, y^2 \rangle = \{ f \in K[x,y] \mid f(0,0) = f_x(0,0) = f_y(0,0) = 0 \}
(b) : \langle x^2, y^2 \rangle = \{ f \in K[x,y] \mid f(0,0) = f_x(0,0) = f_y(0,0) = f_{xy}(0,0) = 0 \}
(c) : \langle x, y^3 \rangle = \{ f \in K[x,y] \mid f(0,0) = f_y(0,0) = f_{yy}(0,0) = 0 \}.


In each case, the multiplicity is equal to the number of conditions defining the 
ideal. 


We will now generalize this description and use it to obtain interesting 
information about the local rings. For instance, in the above example, we will 
see that the descriptions of the ideals in terms of partial derivatives imply the 
following: 


• In cases (b) and (c), the ring is Gorenstein but not in case (a).
• In case (c) the ring is curvilinear but not in cases (a) and (b).


We begin by setting up some notation. Consider the polynomial ring
K[∂_1,...,∂_n]. Then an exponent vector α = (a_1,...,a_n) gives the monomial
∂^α, which we regard as the partial derivative


\partial^\alpha = \frac{\partial^{a_1 + \cdots + a_n}}{\partial x_1^{a_1} \cdots \partial x_n^{a_n}}.

Thus elements of K[∂_1,...,∂_n] become constant coefficient differential
operators on K[x_1,...,x_n].
In examples, we sometimes write ∂^α as ∂_{x^α}. Thus

\partial_{xy^2} = \partial^{(1,2)} = \frac{\partial^3}{\partial x\, \partial y^2}


when operating on K[x, y]. Also note that Example 2.2.4 involves the operators


(a) : 1, \partial_x, \partial_y
(b) : 1, \partial_x, \partial_y, \partial_{xy}        (2.10)
(c) : 1, \partial_y, \partial_{y^2}


applied to polynomials in K[x, y] and evaluated at (0,0). Here, 1 is the identity
operator on K[x, y].

We next define the deflation or shift of D = Σ_α c_α ∂^α ∈ K[∂_1,...,∂_n] by
an exponent vector β to be the operator


\sigma_\beta D = \sum_\alpha c_\alpha \binom{\alpha}{\beta} \partial^{\alpha - \beta},


where \binom{\alpha}{\beta} = \binom{a_1}{b_1} \cdots \binom{a_n}{b_n} and ∂^{α−β} = 0 whenever α − β has a
negative coordinate. The reason for the binomial coefficients in the formula for
σ_β D is that they give the Leibniz formula


D(fg) = \sum_\beta \partial^\beta(f)\, \sigma_\beta D(g)


for f, g ∈ K[x_1,...,x_n]. Here are some simple examples of deflations.
Example 2.2.5. Observe that


∂_{xy} has nonzero deflations ∂_{xy}, ∂_x, ∂_y, 1
∂_{y^2} has nonzero deflations ∂_{y^2}, 2∂_y, 1.


These correspond to cases (b) and (c) of Example 2.2.4. On the other hand,
the operators of case (a) are not deflations of a single operator. As we will
see, this is why case (a) is not Gorenstein.
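Deflations are mechanical to compute once an operator is stored as a table of coefficients. The following sketch is an illustration only, with hypothetical names: an operator D = Σ c_α ∂^α is represented as a dictionary mapping exponent tuples α to coefficients c_α, and the deflation multiplies each coefficient by the product of binomial coefficients from the definition above.

```python
from itertools import product
from math import comb

def deflate(op, beta):
    """sigma_beta(D) for D = {alpha: coefficient}; terms with a negative exponent drop out."""
    result = {}
    for alpha, coeff in op.items():
        if all(a >= b for a, b in zip(alpha, beta)):
            weight = 1
            for a, b in zip(alpha, beta):
                weight *= comb(a, b)
            gamma = tuple(a - b for a, b in zip(alpha, beta))
            result[gamma] = result.get(gamma, 0) + coeff * weight
    return {g: c for g, c in result.items() if c != 0}

def nonzero_deflations(op):
    """All nonzero deflations of D, for beta ranging over the box below its exponents."""
    nvars = len(next(iter(op)))
    bounds = [max(alpha[i] for alpha in op) for i in range(nvars)]
    found = []
    for beta in product(*(range(b + 1) for b in bounds)):
        d = deflate(op, beta)
        if d:
            found.append(d)
    return found

D_XY = {(1, 1): 1}    # the operator d_{xy} on K[x, y]
D_YY = {(0, 2): 1}    # the operator d_{y^2}

print(nonzero_deflations(D_XY))
print(nonzero_deflations(D_YY))
```

The output lists ∂_{xy}, ∂_y, ∂_x, 1 for the first operator and ∂_{y^2}, 2∂_y, 1 for the second, in agreement with Example 2.2.5 (the order of the list simply follows the enumeration of β).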


Definition 2.2.6. A subspace L ⊂ K[∂_1,...,∂_n] is closed if it has finite
dimension over K and is closed under deflation, i.e., σ_β(L) ⊂ L for all β.


The reader can easily check that the differential operators in cases (a), (b)
and (c) of (2.10) span closed subspaces. Here is the main result of Section 2.2.2
(see [MMM93] for a proof).


Theorem 2.2.7. For every solution p of (2.1), there is a unique closed
subspace L_p ⊂ K[∂_1,...,∂_n] of dimension mult(p) such that


\langle f_1,\dots,f_s \rangle = \{ f \in K[x_1,\dots,x_n] \mid D(f)(p) = 0 \ \forall \text{ solutions } p \text{ and } D \in L_p \},


where D(f)(p) means the evaluation of the polynomial D(f) at the point p.
Furthermore, the primary component of ⟨f_1,...,f_s⟩ corresponding to a
solution p is


I_p = \{ f \in K[x_1,\dots,x_n] \mid D(f)(p) = 0 \text{ for all } D \in L_p \},

and conversely,

L_p = \{ D \in K[\partial_1,\dots,\partial_n] \mid D(f)(p) = 0 \text{ for all } f \in I_p \}.


It should not be surprising that Examples 2.2.3 and 2.2.4 are examples of 
this theorem. Here is a more substantial example. 


Example 2.2.8. Consider the equations


f_1 = x^2 + 2y^2 - 2y = 0
f_2 = xy^2 - xy = 0
f_3 = y^3 - 2y^2 + y = 0


from Example 2.1.1. There, we saw that the only solutions were (0,0) and
(0,1). In [MS95], it is shown that


L_{(0,0)} = \mathrm{Span}(1, \partial_x)
L_{(0,1)} = \mathrm{Span}(1, \partial_x, \partial_{x^2} - \partial_y).        (2.11)


Thus ⟨f_1, f_2, f_3⟩ consists of all f ∈ K[x, y] such that


f(0,0) = f_x(0,0) = 0 \quad\text{and}\quad f(0,1) = f_x(0,1) = f_{xx}(0,1) - f_y(0,1) = 0,


and looking at the conditions for (0,0) and (0,1) separately gives the primary
decomposition of ⟨f_1, f_2, f_3⟩. In Section 2.2.3 we will describe how (2.11) was
computed.


Gorenstein and Curvilinear Points. We conclude Section 2.2.2 by ex- 
plaining how special properties of the local ring A, can be determined from 
the representation given in Theorem 2.2.7. 


Theorem 2.2.9. A_p is Gorenstein if and only if there is D ∈ L_p whose
deflations span L_p.


Proof. Let L_p^{ev} denote the linear forms K[x_1,...,x_n] → K obtained by
composing elements of L_p with evaluation at p. Each such map vanishes on I_p and
thus gives an element of Â_p. Hence

L_p \simeq L_p^{ev} \simeq \hat{A}_p.


Furthermore, if D ∈ L_p maps to D̂ ∈ Â_p, then the Leibniz formula makes it
easy to see that the deflation σ_β D maps to (x − p)^β D̂, where


(x - p)^\beta = (x_1 - p_1)^{b_1} \cdots (x_n - p_n)^{b_n}


for p = (p_1,...,p_n) and β = (b_1,...,b_n). (Exercise: Prove this.) Hence these
deflations span L_p if and only if Â_p is generated by a single element as an
A_p-module. In the latter case, we have a surjective A_p-module homomorphism
A_p → Â_p, which is an isomorphism since A_p and Â_p have the same dimension
over K. We are done by the definition of Gorenstein from Section 2.2.1.


Before stating our next result, we need some definitions from [MS95]. 


Definition 2.2.10. The order of D = Σ_α c_α ∂^α is the degree of D as a
polynomial in K[∂_1,...,∂_n]. A basis D_1,...,D_{mult(p)} of L_p is consistently
ordered if for every r ≥ 1, there is j ≥ 1 such that


\mathrm{Span}(D \in L_p \mid D \text{ has order} \le r) = \mathrm{Span}(D_1,\dots,D_j).


Note that every consistently ordered basis has D_1 = 1. Also observe that
the bases listed in (2.10) and (2.11) are consistently ordered.
We can now characterize when A_p is curvilinear.


Theorem 2.2.11. The embedding dimension e_p of A_p is the number of
operators of order 1 in a consistently ordered basis of L_p. In particular, A_p is
curvilinear if and only if any such basis has at most one operator of order 1.


Proof. Let m_p be the maximal ideal of A_p. Recall from equation (2.8) that

e_p = \dim_K m_p/m_p^2 = \# \text{ minimal generators of } m_p.


Also let L_p^r = Span(D ∈ L_p | D has order ≤ r). Then L_p^0 ⊂ L_p^1 ⊂ ··· and, for
r ≥ 0, we have


\dim_K L_p^r / L_p^{r-1} = \# \text{ operators of order } r \text{ in a consistently ordered basis of } L_p.        (2.12)


We claim that there is a natural isomorphism


L_p^1 / L_p^0 \simeq \mathrm{Hom}_K(m_p/m_p^2, K).        (2.13)


Assuming this for the moment, the first assertion of the theorem follows
immediately from (2.12) for r = 1 and the above formula for e_p. Then the final
assertion follows since by definition A_p is curvilinear if and only if it has
embedding dimension e_p ≤ 1.

To prove (2.13), let \mathfrak{M}_p = ⟨x_1 − p_1, ..., x_n − p_n⟩ ⊂ K[x_1,...,x_n] be the
maximal ideal of p. Then any operator D = Σ_{i=1}^n a_i ∂_i induces the linear map


\mathfrak{M}_p \to K


that sends f ∈ \mathfrak{M}_p to D(f)(p). By the product rule, this vanishes if f ∈ \mathfrak{M}_p^2,
so that we get an element of the dual space


\mathrm{Hom}_K(\mathfrak{M}_p/\mathfrak{M}_p^2, K).


Furthermore, it is easy to see that every element of the dual space arises in
this way. (Exercise: Prove these assertions.)
The isomorphism K[x_1,...,x_n]/I_p ≃ A_p induces exact sequences


0 \to I_p \to \mathfrak{M}_p \to m_p \to 0

and

0 \to \mathrm{Hom}_K(m_p/m_p^2, K) \to \mathrm{Hom}_K(\mathfrak{M}_p/\mathfrak{M}_p^2, K) \to \mathrm{Hom}_K(I_p/(I_p \cap \mathfrak{M}_p^2), K) \to 0.


It follows that D = Σ_{i=1}^n a_i ∂_i gives an element of Hom_K(m_p/m_p^2, K) if and
only if D vanishes on I_p, which is equivalent to D ∈ L_p. Since these operators
represent L_p^1/L_p^0, we are done. (Exercise: Fill in the details.)


We get the following corollary of this result and Theorem 2.1.11. 


Corollary 2.2.12. The multiplication map M_f is non-derogatory when f is
a generic linear combination of x_1,...,x_n if and only if for every solution p,
a consistently ordered basis of L_p has at most one operator of order 1.


Since the bases in Example 2.2.8 are consistently ordered, (2.11) shows that
the solutions of the corresponding equations have embedding dimension 1 and
hence are curvilinear. (This is the computation promised in Example 2.1.10.)
Thus M_f is non-derogatory when f is a generic linear combination of x, y. Of
course, we computed a specific instance of this in Example 2.1.6, but now we
know the systematic reason for our success.

Note also that if we apply Theorems 2.2.9 and 2.2.11 to Example 2.2.4,
then we see that the ring is Gorenstein in cases (b) and (c) (but not (a)) and
curvilinear in case (c) (but not (a) and (b)). This proves the claims made in
the two bullets from the discussion following Example 2.2.4.


2.2.3 Two algorithms


We’ve seen that the ideal ⟨f_1,...,f_s⟩ can be described using Gröbner bases
and using conditions on partial derivatives. As we will now explain, going from
one description to the other is a simple matter of linear algebra.


Gröbner Bases to Partial Derivatives. If we have a Gröbner basis for
⟨f_1,...,f_s⟩, then we obtain the required closed subspaces L_p in a three-step
process. The first step is to compute the primary decomposition


\langle f_1,\dots,f_s \rangle = \bigcap_p I_p.


In particular, this means knowing a Gröbner basis for each I_p. We will explain
how to compute such a primary decomposition in Section 2.4.3.

Given this, we fix a primary ideal I_p. We next recall a useful fact which
relates I_p to the maximal ideal \mathfrak{M}_p = ⟨x_1 − p_1, ..., x_n − p_n⟩ of p in K[x_1,...,x_n].


Lemma 2.2.13. If m = mult(p), then \mathfrak{M}_p^m ⊂ I_p.


Proof. It suffices to prove that m_p^m = {0}, where m_p is the maximal ideal of A_p.
By Nakayama’s lemma, we know that m_p^{k+1} ≠ m_p^k whenever m_p^k ≠ {0}.
Using


A_p \supsetneq m_p \supsetneq m_p^2 \supsetneq \cdots \supsetneq m_p^k \supsetneq \{0\},

it follows that dim_K A_p ≥ k + 1 whenever m_p^k ≠ {0}. The lemma follows.


This lemma will enable us to describe I_p in terms of differential operators
of order at most m. However, this description works best when p = 0. So the
second step is to translate so that p = 0. Hence for the rest of our discussion,
we will assume that p = 0. Thus Lemma 2.2.13 tells us that


\mathfrak{M}_0^m \subset I_0, \qquad m = \mathrm{mult}(0).


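The third step is to write down the differential operators in L_0 as follows.
Let B_0 be the set of remainder monomials for the Gröbner basis of I_0 and set


\mathrm{Mon}_m = \{ x^\alpha \mid x^\alpha \notin B_0,\ \deg(x^\alpha) < m \}.

For each x^α ∈ Mon_m, let


x^\alpha \equiv \sum_{x^\beta \in B_0} c_{\alpha\beta}\, x^\beta \bmod I_0.        (2.14)


In other words, Σ_{x^β ∈ B_0} c_{αβ} x^β is the remainder of x^α on division by the
Gröbner basis of I_0. Then, for each x^β ∈ B_0, define


D_\beta = \partial^\beta + \sum_{x^\alpha \in \mathrm{Mon}_m} c_{\alpha\beta}\, \frac{\beta!}{\alpha!}\, \partial^\alpha,


where α! = a_1! ··· a_n! for α = (a_1,...,a_n) and similarly for β!.


Proposition 2.2.14. f ∈ K[x_1,...,x_n] lies in I_0 if and only if D_β(f)(0) = 0
for all x^β ∈ B_0.


Proof. Let f = Σ_α a_α x^α. Since \mathfrak{M}_0^m ⊂ I_0, we can assume that
f = Σ_{deg(α)<m} a_α x^α. Using (2.14), it is straightforward to show that


f \in I_0 \iff a_\beta + \sum_{x^\alpha \in \mathrm{Mon}_m} c_{\alpha\beta}\, a_\alpha = 0 \text{ for all } x^\beta \in B_0.


(Exercise: Prove this equivalence.) However, since


\partial^\alpha(x^\beta)(0) = \begin{cases} \alpha! & \text{if } \alpha = \beta \\ 0 & \text{otherwise,} \end{cases}        (2.15)


one easily sees that for x^β ∈ B_0,

D_\beta(f)(0) = \Bigl( \partial^\beta + \sum_{x^\alpha \in \mathrm{Mon}_m} c_{\alpha\beta}\, \frac{\beta!}{\alpha!}\, \partial^\alpha \Bigr) \Bigl( \sum_{\deg(\alpha) < m} a_\alpha x^\alpha \Bigr)(0)
= \beta! \Bigl( a_\beta + \sum_{x^\alpha \in \mathrm{Mon}_m} c_{\alpha\beta}\, a_\alpha \Bigr).

The proposition now follows immediately since K has characteristic 0.


Here is an example of this result.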


Example 2.2.15. For the polynomials from Example 2.2.8, we will show in
Section 2.4.3 that the primary decomposition is


\langle x^2 + 2y^2 - 2y,\ xy^2 - xy,\ y^3 - 2y^2 + y \rangle = I_{(0,0)} \cap I_{(0,1)}.


Let’s focus on I_{(0,1)}. If we translate this to the origin, then we get the ideal


I_0 = \langle x^2 + 2y,\ xy,\ y^2 \rangle.


The generators are a Gröbner basis for lex order with x > y, the remainder
monomials are B_0 = {1, x, y}, and the multiplicity is m = 3. Thus


\mathrm{Mon}_3 = \{ x^\alpha \mid x^\alpha \notin \{1, x, y\},\ \deg(x^\alpha) < 3 \} = \{ x^2, xy, y^2 \}.

The coefficients c_{αβ} are given by

x^2 \equiv 0 \cdot 1 + 0 \cdot x + (-2) \cdot y \bmod I_0
xy \equiv 0 \cdot 1 + 0 \cdot x + 0 \cdot y \bmod I_0
y^2 \equiv 0 \cdot 1 + 0 \cdot x + 0 \cdot y \bmod I_0,


so that

D_1 = 1, \quad D_x = \partial_x, \quad D_y = \partial_y + (-2)\,\frac{1!}{2!}\,\partial_{x^2} = \partial_y - \partial_{x^2}.


(Exercise: Check this!) Up to a sign, this is the basis of L_{(0,1)} that appeared
in Example 2.2.8. The treatment for L_{(0,0)} is similar. (Exercise: Do it.)
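The passage from the remainders (2.14) to the operators D_β is pure bookkeeping, and it can be sketched as follows. This is an illustrative sketch with hypothetical names, not code from the references: the remainders are supplied by hand as a table c_{αβ} (in practice they would come from division by the Gröbner basis), and each operator is returned as a dictionary mapping exponent tuples to coefficients.

```python
from fractions import Fraction
from math import factorial

def multi_factorial(alpha):
    """alpha! = a_1! * ... * a_n! for an exponent tuple alpha."""
    out = 1
    for a in alpha:
        out *= factorial(a)
    return out

def dual_operators(basis, remainders):
    """D_beta = d^beta + sum_alpha c_{alpha beta} (beta!/alpha!) d^alpha for each beta in basis.

    `remainders` maps each alpha in Mon_m to {beta: c_{alpha beta}} (zero entries may be omitted).
    """
    operators = {}
    for beta in basis:
        op = {beta: Fraction(1)}
        for alpha, coords in remainders.items():
            c = Fraction(coords.get(beta, 0))
            if c != 0:
                op[alpha] = c * Fraction(multi_factorial(beta), multi_factorial(alpha))
        operators[beta] = op
    return operators

# Data of Example 2.2.15: exponents (a, b) stand for x^a y^b.
B0 = [(0, 0), (1, 0), (0, 1)]                 # remainder monomials 1, x, y
REMAINDERS = {
    (2, 0): {(0, 1): -2},                     # x^2 = -2y mod I_0
    (1, 1): {},                               # xy  = 0   mod I_0
    (0, 2): {},                               # y^2 = 0   mod I_0
}

OPS = dual_operators(B0, REMAINDERS)
for beta, op in OPS.items():
    print(beta, op)
```

The entry for β = (0,1) is ∂_y − ∂_{x^2}, and the other two operators are 1 and ∂_x, as computed in the example.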
We also want to remark on an alternate way to view the construction
L_0 = Span(D_β | x^β ∈ B_0). Here, we are in the situation where p = 0, so that
by Theorem 2.2.7, we have


I_0 = \{ f \in K[x_1,\dots,x_n] \mid D_\beta(f)(0) = 0 \text{ for all } x^\beta \in B_0 \}


and

L_0 = \{ D \in K[\partial_1,\dots,\partial_n] \mid D(f)(0) = 0 \text{ for all } f \in I_0 \}.


Now we will do something audacious: switch x_i with ∂_i. This means that I_0
becomes an ideal

\tilde{I}_0 \subset K[\partial_1,\dots,\partial_n]


and L_0 becomes a subspace

\tilde{L}_0 \subset K[x_1,\dots,x_n].

Observe that the pairing (2.15) is unchanged under x_i ↔ ∂_i. Thus:

\tilde{L}_0 = \{ \text{the polynomial solutions of the infinitely many differential operators in } \tilde{I}_0 \}.

This is the point of view taken in Chapter 10 of [Stu02]. Here is an example.
Example 2.2.16. In Example 2.2.15, we showed that

I_0 = \langle x^2 + 2y,\ xy,\ y^2 \rangle \implies L_0 = \mathrm{Span}(1, \partial_x, \partial_y - \partial_{x^2}).

This means that under the switch x ↔ ∂_x, y ↔ ∂_y, the subspace

\tilde{L}_0 = \mathrm{Span}(1, x, y - x^2) \subset K[x, y]


is the space of all polynomial solutions of the infinitely many operators in the
ideal

\tilde{I}_0 = \langle \partial_x^2 + 2\partial_y,\ \partial_x \partial_y,\ \partial_y^2 \rangle \subset K[\partial_x, \partial_y].


Other examples can be found in [Stu02].


We should also note that the description of I_p given by differential
conditions can require a lot of space. Examples and more efficient methods can be
found in Section 3.3 of [MMM96]. 


Partial Derivatives to Gröbner Bases. Now suppose that conversely, we
are given the data of Theorem 2.2.7. This means that for each solution p we
have a closed subspace L_p of dimension mult(p) such that


\langle f_1,\dots,f_s \rangle = \{ f \in K[x_1,\dots,x_n] \mid D(f)(p) = 0 \text{ for all } p \text{ and } D \in L_p \}.


If we pick a basis D_{p,i} of each L_p, then the linear forms f ↦ D_{p,i}(f)(p) give
a linear map

L : K[x_1,\dots,x_n] \to K^m        (2.16)


where m = dim_K A. This map is surjective and its kernel is ⟨f_1,...,f_s⟩. Given
an order > (with some restrictions to be noted below), our goal is to find a
Gröbner basis of ⟨f_1,...,f_s⟩ with respect to > using the linear map (2.16).

The idea is to simultaneously build up the Gröbner basis G and the set of
remainder monomials B. We begin with both lists being empty and feed in
monomials one at a time, beginning with 1. The main loop of the algorithm
is described as follows.


Main Loop: Given a monomial x^α, compute L(x^α) together with L(x^β) for
all x^β ∈ B.


• If L(x^α) is linearly dependent on the L(x^β), then compute a linear relation


L(x^\alpha) = \sum_{x^\beta \in B} a_\beta L(x^\beta), \qquad a_\beta \in K


(hence x^α − Σ_{x^β ∈ B} a_β x^β ∈ ⟨f_1,...,f_s⟩) and add x^α − Σ_{x^β ∈ B} a_β x^β to G.
• If L(x^α) is linearly independent from the L(x^β), then add x^α to B.


Once this loop is done for x^α, we feed in the next monomial, which is the
minimal element (with respect to >) of the set


N(x^\alpha, G) = \{ \text{monomials } x^\gamma > x^\alpha \text{ such that } x^\gamma \text{ is not divisible by the leading term of any } g \in G \}.        (2.17)


Hence we need to find the minimal element of N(x^α, G). As explained in
[BW93], this is easy to do whenever > is a lex or total degree order. The
algorithm terminates when (2.17) becomes empty.

In [MMM93], it is shown that this algorithm always terminates and that
when this happens, G is the desired Gröbner basis and B is the corresponding
set of remainder monomials. Here is an example.


Example 2.2.17. In the notation of Example 2.2.15, let p = (0,0) and L_0 =
Span(1, ∂_x, ∂_y − ∂_{x^2}). It follows that I_0 is the kernel of the map


L : K[x, y] \to K^3


defined by L(f) = (f(0,0), f_x(0,0), f_y(0,0) − f_{xx}(0,0)). If we use lex order
with x > y, then the above algorithm starts with B = G = ∅ and proceeds as
follows:


x^α   | L(x^α)   | B         | G                     | min(N(x^α, G))
1*    | (1,0,0)  | {1}       | ∅                     | y
y*    | (0,0,1)  | {1,y}     | ∅                     | y^2
y^2   | (0,0,0)  | {1,y}     | {y^2}                 | x                 (2.18)
x*    | (0,1,0)  | {1,y,x}   | {y^2}                 | xy
xy    | (0,0,0)  | {1,y,x}   | {y^2, xy}             | x^2
x^2   | (0,0,−2) | {1,y,x}   | {y^2, xy, x^2 + 2y}   | none!

In this table, an asterisk denotes those monomials which become remainder
monomials. The other monomials are leading terms of the Gröbner basis.
(Exercise: Check the steps of the algorithm to see how it works.)


A complexity analysis of this algorithm can be found in [MMM93]. 
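The main loop can be turned into a short program. The sketch below is a hedged illustration rather than the algorithm of [MMM93] as published: the function names are hypothetical, exact rational arithmetic is used for the linear algebra, and the next monomial is found by an assumed shortcut, namely taking the smallest monomial of the form x_i · x^β with x^β ∈ B that is larger than the current one and not divisible by a known leading term. Monomials are exponent tuples, and the monomial order is passed as a sort key.

```python
from fractions import Fraction

def express(vectors, target):
    """Coefficients c with sum c[i]*vectors[i] == target, or None if target is independent.

    The vectors are assumed to be linearly independent.
    """
    k = len(vectors)
    rows = [[Fraction(v[i]) for v in vectors] + [Fraction(target[i])]
            for i in range(len(target))]
    pivots, r = [], 0
    for c in range(k):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if any(row[k] != 0 for row in rows[r:]):
        return None
    coeffs = [Fraction(0)] * k
    for i, c in enumerate(pivots):
        coeffs[c] = rows[i][k]
    return coeffs

def divides(a, b):
    return all(x <= y for x, y in zip(a, b))

def groebner_from_functionals(L, nvars, key):
    """Run the main loop; L maps an exponent tuple to the vector L(x^alpha)."""
    basis, images, G = [], [], []
    current = (0,) * nvars
    while current is not None:
        value = L(current)
        coeffs = express(images, value)
        if coeffs is None:                      # independent: new remainder monomial
            basis.append(current)
            images.append(value)
        else:                                   # dependent: new Groebner basis element
            g = {current: Fraction(1)}
            g.update({b: -c for b, c in zip(basis, coeffs) if c != 0})
            G.append(g)
        leading = [max(g, key=key) for g in G]
        border = {b[:i] + (b[i] + 1,) + b[i + 1:] for b in basis for i in range(nvars)}
        candidates = [m for m in border
                      if key(m) > key(current) and m not in basis
                      and not any(divides(l, m) for l in leading)]
        current = min(candidates, key=key) if candidates else None
    return G, basis

def L_example(e):
    """L(f) = (f(0,0), f_x(0,0), f_y(0,0) - f_xx(0,0)) evaluated on the monomial x^a y^b."""
    return (int(e == (0, 0)), int(e == (1, 0)), int(e == (0, 1)) - 2 * int(e == (2, 0)))

def lex_x_gt_y(e):
    return e                                    # tuples (a, b) compare the x-exponent first

G, B = groebner_from_functionals(L_example, 2, lex_x_gt_y)
print(B)
print(G)
```

With this input the program visits 1, y, y^2, x, xy, x^2 in the same order as table (2.18) and returns the polynomials y^2, xy and x^2 + 2y, each stored as a dictionary from exponent tuples to coefficients. The function depends on the problem only through the map L and the order, which is the point made in Section 2.2.4.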


2.2.4 Ideals of points and basis conversion 


We conclude by observing that the algorithm illustrated in Example 2.2.17
applies to many situations besides partial derivatives. The key point is that if


L : K[x_1,\dots,x_n] \to K^m


is any surjective linear map whose kernel is an ideal I, then the algorithm
described in the discussion following (2.16) gives a Gröbner basis for I. Here
are two situations where this is useful.


Ideals of Points. Suppose we have a finite list of points p_1,...,p_m ∈ K^n.
Then we want to compute a Gröbner basis of the ideal


I = \{ f \in K[x_1,\dots,x_n] \mid f(p_1) = \cdots = f(p_m) = 0 \}


consisting of all polynomials which vanish at p_1,...,p_m. This is now easy, for
the points give the linear map L : K[x_1,...,x_n] → K^m defined by


L(f) = (f(p_1),\dots,f(p_m))


whose kernel is the ideal I. Furthermore, it is easy to see that L is surjective
(see the proof of Theorem 2.10 of Chapter 2 of [CLO98]). Thus we can find a
Gröbner basis of I using the above algorithm.


Example 2.2.18. Consider the points (0,0), (1,0), (0,1) ∈ K^2. This gives the
linear map L : K[x, y] → K^3 defined by


L(f) = (f(0,0), f(1,0), f(0,1)).


If you apply the algorithm for lex order with x > y as in Example 2.2.17,
you will obtain a table remarkably similar to (2.18), except that the Gröbner
basis will be {y^2 − y, xy, x^2 − x}. (Exercise: Do this computation.)


A more complete treatment appears in [MMM93]. This is also related to
the Buchberger-Möller algorithm introduced in [BM82]. The harder problem
of computing the homogeneous ideal of a finite set of points in projective space
is discussed in [ABKR00].


Basis Conversion. Suppose that we have a Gröbner basis G′ for ⟨f_1,...,f_s⟩
with respect to one order >′ and want to find a Gröbner basis G with respect
to a second order >.

We can do this as follows. Let B′ be the set of remainder monomials with
respect to G′. Then taking the remainder on division by G′ gives a linear map


L : K[x_1,\dots,x_n] \to \mathrm{Span}(B') \simeq K^m.


The kernel is ⟨f_1,...,f_s⟩ and the map is surjective since L(x^β) = x^β for
x^β ∈ B′. Then we can apply the above method to find the desired Gröbner
basis G. This is the FGLM basis conversion algorithm of [FGLM93].


Example 2.2.19. By Example 2.2.17, $\{y^2,\ xy,\ x^2 + 2y\}$ is a Gröbner basis with 
respect to lex order with $x > y$. To convert to a lex order Gröbner basis with 
$y > x$, we apply the above method. After constructing a table similar to 
(2.18), we obtain the Gröbner basis $\{x^3,\ y + \tfrac{1}{2}x^2\}$. (Exercise: Do this.) 
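The result of this conversion can be checked with a few lines of Python. The sketch below is an illustration under stated assumptions rather than the table-based computation itself: the matrices `MX` and `MY` are written down by hand from the relations $x^2 \equiv -2y$, $xy \equiv 0$, $y^2 \equiv 0$, and the helper names are hypothetical. It computes the remainder map $L$ in the coordinates $(1, x, y)$ and confirms that $x^3$ and $y + \tfrac12 x^2$ lie in its kernel.

```python
from fractions import Fraction

# Multiplication by x and y on the remainder monomials B' = (1, x, y) of
# G' = {y^2, xy, x^2 + 2y}; column j is the remainder of x*b_j (resp. y*b_j).
MX = [[0, 0, 0], [1, 0, 0], [0, -2, 0]]   # x*1 = x, x*x = -2y, x*y = 0
MY = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]    # y*1 = y, y*x = 0,   y*y = 0

def apply(M, v):
    """Matrix-vector product with exact rational arithmetic."""
    return [sum(Fraction(M[i][j]) * v[j] for j in range(3)) for i in range(3)]

def L(a, b):
    """Remainder of x^a y^b on division by G', as coordinates in (1, x, y)."""
    v = [Fraction(1), Fraction(0), Fraction(0)]
    for _ in range(b):
        v = apply(MY, v)
    for _ in range(a):
        v = apply(MX, v)
    return v

x_cubed = L(3, 0)                                        # image of x^3
combo = [p + Fraction(1, 2) * q for p, q in zip(L(0, 1), L(2, 0))]  # y + x^2/2
```

Both `x_cubed` and `combo` are the zero vector, while the images of $1$, $x$, $x^2$ are $(1,0,0)$, $(0,1,0)$, $(0,0,-2)$ and hence linearly independent, so $1, x, x^2$ are the remainder monomials for the order with $y > x$.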


Besides ideals of points and basis conversion, this algorithm has other 
interesting applications. See [MMM93] for details. 


2.3 Resultants 


The method for solving equations discussed in Section 2.1 assumed that we 
had a Gröbner basis available. In this section, we will see that when our 
equations have more structure, we can often compute the multiplication matrices 
directly, without using a Gröbner basis. This will lead to a method for solving 
equations closely related to the theory of resultants. 


2.3.1 Solving equations 


We will work in $K[x_1,\dots,x_n]$, $K$ algebraically closed, but we will now assume 
that we have $n$ equations in $n$ unknowns, i.e., 

$$f_1(x_1,\dots,x_n) = \cdots = f_n(x_1,\dots,x_n) = 0. \tag{2.19}$$


Solutions to such a system are described by Bézout’s theorem. 


Theorem 2.3.1. Consider a system of equations (2.19) as above and let $d_i$ 
be the degree of $f_i$. Then: 

1. If the system has only finitely many solutions, then the total number of 
solutions (counted with multiplicity) is at most $\mu = d_1 \cdots d_n$. 

2. If $f_1,\dots,f_n$ are generic, then there are precisely $\mu = d_1 \cdots d_n$ solutions, 
all of multiplicity 1. 


To find the solutions, we will use a method first described by Auzinger and 
Stetter [AS88]. The idea is to construct a $\mu \times \mu$ matrix whose eigenvectors 
will determine the solutions. For this purpose, let 

$$d = d_1 + \cdots + d_n - n + 1 \tag{2.20}$$

and divide the monomials of degree $\le d$ into $n + 1$ disjoint sets as follows: 

$$\begin{aligned}
S_n &= \{x^\gamma : \deg(x^\gamma) \le d,\ x_n^{d_n} \text{ divides } x^\gamma\} \\
S_{n-1} &= \{x^\gamma : \deg(x^\gamma) \le d,\ x_n^{d_n} \text{ doesn't divide } x^\gamma \text{ but } x_{n-1}^{d_{n-1}} \text{ does}\} \\
&\ \ \vdots \\
S_0 &= \{x^\gamma : \deg(x^\gamma) \le d,\ x_1^{d_1},\dots,x_n^{d_n} \text{ don't divide } x^\gamma\}.
\end{aligned}$$

Note that 

$$S_0 = \{x_1^{b_1} \cdots x_n^{b_n} \mid 0 \le b_i \le d_i - 1 \text{ for all } i\}, \text{ so that } \#S_0 = \mu. \tag{2.21}$$

(Exercise: Prove this.) Since $S_0$ plays a special role, we will use $x^\alpha$ to denote 
elements of $S_0$ and $x^\beta$ to denote elements of $S_1 \cup \cdots \cup S_n$. Now observe that 

if $x^\alpha \in S_0$, then $x^\alpha$ has degree $\le d - 1$, 
if $x^\beta \in S_i$, $i > 0$, then $x^\beta / x_i^{d_i}$ has degree $\le d - d_i$, 

where the first assertion uses $d - 1 = d_1 + \cdots + d_n - n = \sum_{i=1}^n (d_i - 1)$. 
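To make the partition concrete, here is a small Python sketch that enumerates $S_0,\dots,S_n$ for given degrees. The function name `partition_monomials` and the use of exponent tuples to represent monomials are assumptions made for illustration; a monomial is placed in $S_i$ where $i$ is the largest index with $x_i^{d_i}$ dividing it, and in $S_0$ if there is none.

```python
from itertools import product

def partition_monomials(degs):
    """Split the monomials of degree <= d into S_0, ..., S_n as in (2.20)-(2.21)."""
    n = len(degs)
    d = sum(degs) - n + 1
    S = [[] for _ in range(n + 1)]
    for e in product(range(d + 1), repeat=n):
        if sum(e) > d:
            continue
        # largest i such that x_i^{d_i} divides the monomial (0 if there is none)
        i = max((k + 1 for k in range(n) if e[k] >= degs[k]), default=0)
        S[i].append(e)
    return d, S

d, S = partition_monomials((2, 2))
```

For $d_1 = d_2 = 2$ this gives $d = 3$, $S_0 = \{1, y, x, xy\}$ with $\#S_0 = 4 = d_1 d_2$, $S_1 = \{x^2, x^2y, x^3\}$ and $S_2 = \{y^2, y^3, xy^2\}$, which together account for all ten monomials of degree at most 3.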
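Now let $f_0 = a_1x_1 + \cdots + a_nx_n$, $a_i \in K$, and consider the equations: 

$$\begin{aligned}
x^\alpha f_0 &= 0 \quad \text{for all } x^\alpha \in S_0 \\
(x^\beta / x_1^{d_1})\, f_1 &= 0 \quad \text{for all } x^\beta \in S_1 \\
&\ \ \vdots \\
(x^\beta / x_n^{d_n})\, f_n &= 0 \quad \text{for all } x^\beta \in S_n.
\end{aligned}$$

Since the $x^\alpha f_0$ and $(x^\beta / x_i^{d_i}) f_i$ have degree $\le d$, we can write these polynomials 
as linear combinations of the $x^\alpha$ and $x^\beta$. We will order these monomials so that 
the elements $x^\alpha \in S_0$ come first, followed by the elements $x^\beta \in S_1 \cup \cdots \cup S_n$. 
This gives a square matrix $M_0$ such that 

$$M_0 \begin{pmatrix} x^{\alpha_1} \\ x^{\alpha_2} \\ \vdots \\ x^{\beta_1} \\ x^{\beta_2} \\ \vdots \end{pmatrix}
= \begin{pmatrix} x^{\alpha_1} f_0 \\ x^{\alpha_2} f_0 \\ \vdots \\ (x^{\beta_1}/x_1^{d_1})\, f_1 \\ (x^{\beta_2}/x_1^{d_1})\, f_1 \\ \vdots \end{pmatrix}, \tag{2.22}$$

where, in the column on the left, the first two elements of $S_0$ and the first 
two elements of $S_1$ are listed explicitly. The situation is similar for the column 
on the right. Each entry of $M_0$ is either 0 or a coefficient of some $f_i$. In the 
literature, $M_0$ is called a Sylvester-type matrix. 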

We next partition $M_0$ so that the rows and columns of $M_0$ corresponding 
to elements of $S_0$ lie in the upper left hand corner. This gives 

$$M_0 = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}, \tag{2.23}$$

where $M_{00}$ is a $\mu \times \mu$ matrix for $\mu = d_1 \cdots d_n$, and $M_{11}$ is also a square matrix. 
One can show that $M_{11}$ is invertible for a generic choice of $f_1,\dots,f_n$. Hence 
we can define the $\mu \times \mu$ matrix 

$$\widetilde{M}_{f_0} = M_{00} - M_{01} M_{11}^{-1} M_{10}. \tag{2.24}$$

Also, given a point $p \in K^n$, let $p^\alpha$ be the column vector $(p^{\alpha_1}, p^{\alpha_2}, \dots)^t$ 
obtained by evaluating all monomials in $S_0$ at $p$ (where $t$ means transpose). 


Theorem 2.3.2. Let $f_1,\dots,f_n$ be generic polynomials, where $f_i$ has total 
degree $d_i$, and construct $\widetilde{M}_{f_0}$ as in (2.24) with $f_0 = a_1x_1 + \cdots + a_nx_n$. Then 
$p^\alpha$ is an eigenvector of $\widetilde{M}_{f_0}$ with eigenvalue $f_0(p)$ whenever $p$ is a solution of 
(2.19). Furthermore, the vectors $p^\alpha$ are linearly independent as $p$ ranges over 
all solutions of (2.19). 


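Proof. Let $p^\beta$ be the column vector $(p^{\beta_1}, p^{\beta_2}, \dots)^t$ given by evaluating all 
monomials in $S_1 \cup \cdots \cup S_n$ at $p$. Then evaluating (2.22) at a solution $p$ of 
(2.19) gives 

$$M_0 \begin{pmatrix} p^\alpha \\ p^\beta \end{pmatrix} = \begin{pmatrix} f_0(p)\, p^\alpha \\ 0 \end{pmatrix}.$$

In terms of (2.23), this becomes 

$$\begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix} \begin{pmatrix} p^\alpha \\ p^\beta \end{pmatrix} = \begin{pmatrix} f_0(p)\, p^\alpha \\ 0 \end{pmatrix},$$

and it follows that 

$$\widetilde{M}_{f_0}\, p^\alpha = f_0(p)\, p^\alpha. \tag{2.25}$$

(Exercise: Prove this.) Hence, for a solution $p$, $f_0(p)$ is an eigenvalue of $\widetilde{M}_{f_0}$ 
with eigenvector $p^\alpha$. For generic $a_1,\dots,a_n$, $f_0 = a_1x_1 + \cdots + a_nx_n$ takes 
distinct values at the solutions, i.e., the eigenvalues $f_0(p)$ are distinct. This 
shows that the corresponding eigenvectors $p^\alpha$ are linearly independent. 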
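The construction of $M_0$, its block decomposition (2.23), and the Schur complement (2.24) can be traced on a small case. The Python sketch below is illustrative only: the system $f_1 = x^2 + y - 2$, $f_2 = y^2 + x - 2$ and the form $f_0 = x + 2y$ are choices made here (not taken from the text), picked because $(1,1)$ and $(-2,-2)$ are visibly solutions and $M_{11}$ turns out to be invertible; all helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

degs = (2, 2)                                   # d_1, d_2
f0 = {(1, 0): 1, (0, 1): 2}                     # f_0 = x + 2y   (assumed example)
fs = [{(2, 0): 1, (0, 1): 1, (0, 0): -2},       # f_1 = x^2 + y - 2
      {(0, 2): 1, (1, 0): 1, (0, 0): -2}]       # f_2 = y^2 + x - 2
n, d = len(degs), sum(degs) - len(degs) + 1

def block(e):
    """Index i of the set S_i that contains the monomial with exponents e."""
    return max((k + 1 for k in range(n) if e[k] >= degs[k]), default=0)

monos = [e for e in product(range(d + 1), repeat=n) if sum(e) <= d]
order = sorted(monos, key=lambda e: (block(e), e))   # S_0 first, then S_1, ..., S_n
col = {e: j for j, e in enumerate(order)}
mu = sum(1 for e in order if block(e) == 0)

def row(e):
    """Coefficients of x^a f_0 (e in S_0) or of (x^b / x_i^{d_i}) f_i (e in S_i)."""
    i = block(e)
    if i == 0:
        shift, poly = e, f0
    else:
        shift = tuple(a - (degs[k] if k == i - 1 else 0) for k, a in enumerate(e))
        poly = fs[i - 1]
    r = [Fraction(0)] * len(order)
    for m, c in poly.items():
        r[col[tuple(a + b for a, b in zip(m, shift))]] += c
    return r

def solve(A, B):
    """Gauss-Jordan solution X of A X = B for an invertible square matrix A."""
    size = len(A)
    aug = [A[i] + B[i] for i in range(size)]
    for c in range(size):
        p = next(i for i in range(c, size) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [a / piv for a in aug[c]]
        for i in range(size):
            if i != c and aug[i][c] != 0:
                fac = aug[i][c]
                aug[i] = [a - fac * b for a, b in zip(aug[i], aug[c])]
    return [r[size:] for r in aug]

M0 = [row(e) for e in order]
M00 = [r[:mu] for r in M0[:mu]]
M01 = [r[mu:] for r in M0[:mu]]
M10 = [r[:mu] for r in M0[mu:]]
M11 = [r[mu:] for r in M0[mu:]]
X = solve(M11, M10)                              # X = M11^{-1} M10
Mt = [[M00[i][j] - sum(M01[i][k] * X[k][j] for k in range(len(X)))
       for j in range(mu)] for i in range(mu)]   # Schur complement (2.24)

def is_eigenvector(p):
    """Check that Mt p^alpha = f_0(p) p^alpha for the point p."""
    px, py = Fraction(p[0]), Fraction(p[1])
    v = [px ** e[0] * py ** e[1] for e in order[:mu]]
    lam = sum(c * px ** e[0] * py ** e[1] for e, c in f0.items())
    return [sum(Mt[i][j] * v[j] for j in range(mu)) for i in range(mu)] == [lam * a for a in v]
```

With the monomials of $S_0$ ordered as $1, y, x, xy$, calling `is_eigenvector((1, 1))` and `is_eigenvector((-2, -2))` confirms (2.25) with eigenvalues $3$ and $-6$, whereas a point that is not a solution, such as $(1, 0)$, fails the check.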


We can now solve (2.19) by the method of Section 2.1.3. We know that 
there are $\mu = d_1 \cdots d_n$ solutions $p$. Furthermore, the values $f_0(p)$ are distinct 
for a generic choice of $f_0 = a_1x_1 + \cdots + a_nx_n$. Then Theorem 2.3.2 implies 
that the $\mu \times \mu$ matrix $\widetilde{M}_{f_0}$ has $\mu$ eigenvectors $p^\alpha$. Hence all of the eigenspaces 
must have dimension 1, i.e., $\widetilde{M}_{f_0}$ is non-derogatory. 

Also notice that $1 \in S_0$ by (2.21). It follows that we can assume that every 
$p^\alpha$ is of the form 

$$p^\alpha = (1, p^{\alpha(2)}, \dots, p^{\alpha(\mu)})^t.$$

Thus, once we compute an eigenvector $v$ of $\widetilde{M}_{f_0}$ for the eigenvalue $f_0(p)$, we 
know how to rescale $v$ so that $v = p^\alpha$. 
As in Section 2.1.3, the idea is to read off the solution $p$ from the entries 
of the eigenvector $p^\alpha$. If $f_i$ has degree $d_i > 1$, then $x_i \in S_0$, so that $p_i$ appears 
as a coordinate of $p^\alpha$. Hence we can recover all coordinates of $p$ except for 
those corresponding to equations with $d_i = 1$. These were called the "missing 
variables" in Section 2.1.3. In this situation, the missing variables correspond 
to linear equations. Since we can find the coordinates of the solution $p$ for 
all of the other variables, we simply substitute these known values into the 
linear equations corresponding to the missing variables. Hence we find all 
coordinates of the solution by linear algebra. Details of this procedure are 
described in Exercise 5 of Section 3.6 of [CLO98]. 

This is all very nice but seems to ignore the quotient algebra 

$$A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_n\rangle.$$

In fact, what we did above has a deep relation to $A$ as follows. 

Theorem 2.3.3. If $f_1,\dots,f_n$ are generic polynomials, where $f_i$ has total 
degree $d_i$, then the cosets of the monomials 

$$S_0 = \{x_1^{b_1} \cdots x_n^{b_n} \mid 0 \le b_i \le d_i - 1 \text{ for all } i\}$$

form a basis of the quotient algebra $A$. Furthermore, if $f_0 = a_1x_1 + \cdots + a_nx_n$ 
and $\widetilde{M}_{f_0}$ is the matrix constructed in (2.24) using $f_0, f_1,\dots,f_n$, then 

$$\widetilde{M}_{f_0} = M_{f_0}^{\,t},$$

where $M_{f_0}$ is the matrix of the multiplication map $M_{f_0} : A \to A$ relative to the 
basis given by $S_0$. 


Proof. Recall from Bézout's theorem that when $f_1,\dots,f_n$ are generic, the 
equations (2.19) have $\mu = d_1 \cdots d_n$ solutions of multiplicity 1 in $K^n$. It follows 
that $A$ has dimension $\mu$ over $K$. Since this is also the cardinality of $S_0$, the first 
part of the theorem will follow once we show that the cosets of the monomials 
in $S_0$ are linearly independent. 

Write the elements of $S_0$ as $x^{\alpha(1)},\dots,x^{\alpha(\mu)}$ and suppose we have a linear 
relation among the cosets $[x^{\alpha(j)}]$, say 

$$c_1[x^{\alpha(1)}] + \cdots + c_\mu[x^{\alpha(\mu)}] = 0.$$

Evaluating this equation at a solution $p$ makes sense and implies that 

$$c_1 p^{\alpha(1)} + \cdots + c_\mu p^{\alpha(\mu)} = 0. \tag{2.26}$$

In the generic case, our equations have $\mu = d_1 \cdots d_n$ solutions, so that (2.26) 
gives $\mu$ equations in $\mu$ unknowns $c_1,\dots,c_\mu$. But the coefficients of the rows 
give the transposes of the vectors $p^\alpha$, which are linearly independent by 
Theorem 2.3.2. It follows that $c_1 = \cdots = c_\mu = 0$. This proves that the cosets 
$[x^{\alpha(1)}],\dots,[x^{\alpha(\mu)}]$ are linearly independent. Thus $S_0$ gives a basis of $A$ as 
claimed. 

For the second assertion, observe from equation (2.3) that 

$$M_{f_0}^{\,t}\, p^\alpha = f_0(p)\, p^\alpha$$

for each solution $p$. Comparing this to (2.25), we get 

$$M_{f_0}^{\,t}\, p^\alpha = \widetilde{M}_{f_0}\, p^\alpha$$

for all solutions $p$. Since $f_1,\dots,f_n$ are generic, we have $\mu$ solutions $p$, and the 
corresponding eigenvectors $p^\alpha$ are linearly independent by Theorem 2.3.2. 
This implies $M_{f_0}^{\,t} = \widetilde{M}_{f_0}$. 


The above proof of $\widetilde{M}_{f_0} = M_{f_0}^{\,t}$ requires that $\det(M_{11}) \ne 0$ and that all 
solutions have multiplicity 1. In Chapter 3, Theorem 3.5.1 will show that 
$\widetilde{M}_{f_0} = M_{f_0}^{\,t}$ holds under the weaker hypothesis that $\det(M_{11}) \ne 0$. (Note that 
the matrix $M_0$ defined in Section 3.5.1 is the transpose of our $M_0$. Thus the 
"Schur complement" defined in Theorem 3.5.1 is what we call $\widetilde{M}_{f_0}^{\,t}$.) 

It is satisfying to see how the method described in this section relates to 
what we did in Section 2.1. However, there is a lot more going on here. Here 
are some items of interest. 


Multiplication Matrices. By setting $f_0 = x_i$ in Theorem 2.3.3, we can 
construct the matrix of multiplication by $x_i$ as $M_{x_i} = \widetilde{M}_{x_i}^{\,t}$. However, it is 
possible to compute all of these maps simultaneously by using $f_0 = u_1x_1 + 
\cdots + u_nx_n$, where $u_1,\dots,u_n$ are variables. In the decomposition (2.23), the 
matrices $M_{10}$ and $M_{11}$ don't involve the coefficients of $f_0$. Thus, we can still 
form the matrix $\widetilde{M}_{f_0}$ from (2.24), and it is easy to see that 

$$\widetilde{M}_{f_0} = u_1 M_{x_1}^{\,t} + \cdots + u_n M_{x_n}^{\,t}.$$

Thus one computation gives all of the multiplication matrices $M_{x_i}$. See the 
discussion following Theorem 3.5.1 in Chapter 3 for an example. 


Solving via Multivariate Factorization. As above, suppose that $f_0 = 
u_1x_1 + \cdots + u_nx_n$, where $u_1,\dots,u_n$ are variables. In this case, $\det(\widetilde{M}_{f_0})$ 
becomes a polynomial in $K[u_1,\dots,u_n]$. The results of this section imply that 
for $f_1,\dots,f_n$ generic, the eigenvalues of $\widetilde{M}_{f_0}$ are $f_0(p)$ as $p$ ranges over all 
solutions of (2.19). Since all of the eigenspaces have dimension 1, we obtain 

$$\det(\widetilde{M}_{f_0}) = \prod_p (u_1p_1 + \cdots + u_np_n). \tag{2.27}$$

It follows that if we can factor $\det(\widetilde{M}_{f_0})$ into irreducibles in $K[u_1,\dots,u_n]$, then 
we get all solutions of (2.19). The general problem of multivariate factorization 
over an algebraically closed field will be discussed in Chapter 9. We will see 
in Section 2.3.2 that (2.27) is closely related to resultants. 
Ideal Membership. Given $f \in K[x_1,\dots,x_n]$, how do we tell if $f \in 
\langle f_1,\dots,f_n\rangle$? This is the Ideal Membership Problem. How do we do this 
without a Gröbner basis? One method (probably not very efficient) uses the above 
matrices $M_{x_i}$ as follows: 

$$f \in \langle f_1,\dots,f_n\rangle \iff f(M_{x_1},\dots,M_{x_n}) \text{ is the zero matrix}.$$

To prove this criterion, note that $f(M_{x_1},\dots,M_{x_n}) = M_f$ since the $M_{x_i}$ commute. 
Then we are done since $M_f$ is the zero matrix if and only if $f$ is in the ideal. 
(Exercise: Supply the details.) 
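A small Python sketch can illustrate this membership test. Everything specific in it is an assumption chosen for illustration: the ideal is $\langle x^2 - 1,\ y^2 - 1\rangle$, the basis of $A$ is $(1, x, y, xy)$, and the matrices `MX`, `MY` are written down by hand from $x^2 \equiv 1$ and $y^2 \equiv 1$ rather than computed from (2.24). A polynomial is a dictionary mapping exponent pairs to coefficients.

```python
def matmul(A, B):
    """Product of two square integer matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, e):
    """e-th power of A by repeated multiplication (e >= 0)."""
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = matmul(R, A)
    return R

def evaluate(f, MX, MY):
    """Substitute the commuting matrices MX, MY for x, y in the polynomial f."""
    n = len(MX)
    total = [[0] * n for _ in range(n)]
    for (a, b), c in f.items():
        term = matmul(matpow(MX, a), matpow(MY, b))
        total = [[total[i][j] + c * term[i][j] for j in range(n)] for i in range(n)]
    return total

def in_ideal(f, MX, MY):
    """Membership test: f is in the ideal iff f(MX, MY) is the zero matrix."""
    return all(v == 0 for r in evaluate(f, MX, MY) for v in r)

# Multiplication by x and by y on the basis (1, x, y, xy); column j is the
# image of the j-th basis element.
MX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
MY = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

member = in_ideal({(3, 1): 1, (1, 1): -1}, MX, MY)      # x^3 y - x y
non_member = in_ideal({(1, 0): 1, (0, 1): -1}, MX, MY)  # x - y
```

Here `member` is `True` because $x^3y - xy = xy(x^2 - 1)$, while `non_member` is `False` since $x - y$ does not vanish at the solution $(1, -1)$.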


Sparse Polynomials. It is also possible to develop a sparse version of the 
solution method described in this section. The idea is that one fixes in 
advance the terms which appear in each $f_i$ and then considers what happens 
when $f_i$ is generic relative to these terms. One gets results similar to 
Theorems 2.3.2 and 2.3.3, and there are also nice relations to polyhedral geometry. 
This material is discussed in Chapter 7. See also [CLO98, Chapter 7]. 


Duality. The assumption that $f_1,\dots,f_n$ have only finitely many solutions in 
$K^n$ implies that these polynomials form a regular sequence. This allows us to 
apply the duality theory of complete intersections. There are also interesting 
relations with the multidimensional residues discussed in Chapters 1 and 3. 
This material is also covered in greater detail in [EM96] and [EM98]. 


2.3.2 Multivariate resultants 


The multivariable resultant $\mathrm{Res}_{d_0,\dots,d_n}$ is an irreducible polynomial 
in the coefficients of $n + 1$ homogeneous polynomials 

$$F_0,\dots,F_n \in K[x_0,\dots,x_n]$$

of degrees $d_0,\dots,d_n$ with the property that 

$$\mathrm{Res}_{d_0,\dots,d_n}(F_0,\dots,F_n) = 0$$

if and only if the $F_i$ have a common solution in the projective space $\mathbb{P}^n(K)$ 
(as usual, $K = \overline{K}$). 

This resultant has an affine version as follows. If we dehomogenize $F_i$ by 
setting $x_0 = 1$, then we get polynomials $f_i \in K[x_1,\dots,x_n]$ of degree at most 
$d_i$. Since $F_i$ and $f_i$ have the same coefficients, we can write the resultant as 

$$\mathrm{Res}_{d_0,\dots,d_n}(f_0,\dots,f_n).$$

Then the vanishing of this resultant means that the system of equations 
$f_0 = \cdots = f_n = 0$ has a solution either in $K^n$ or "at $\infty$," i.e., a 
projective solution with $x_0 = 0$. 
In the situation of Section 2.3.1, we have $n$ polynomials $f_1,\dots,f_n$ of 
degrees $d_1,\dots,d_n$ in $x_1,\dots,x_n$. To compute a resultant, we need one more 
polynomial. Not surprisingly, we will use 

$$f_0 = a_1x_1 + \cdots + a_nx_n.$$

We will usually assume $a_i \in K$, though (as illustrated at the end of 
Section 2.3.1) it is sometimes useful to replace $a_i$ with a variable $u_i$. 

In order to compute the resultant $\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)$, we need to 
study the behavior of the system $f_1 = \cdots = f_n = 0$ at $\infty$. Write 

$$f_i = \sum_{j=0}^{d_i} f_{i,j},$$

where $f_{i,j}$ is homogeneous of degree $j$ in $x_1,\dots,x_n$. Then $f_i$ homogenizes to 

$$F_i = \sum_{j=0}^{d_i} f_{i,j}\, x_0^{d_i - j}$$

of degree $d_i$ in $x_0, x_1,\dots,x_n$. Then (2.19) has a solution at $\infty$ when the 
homogenized system 

$$F_1 = \cdots = F_n = 0$$

has a nontrivial solution with $x_0 = 0$. 
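As a brief worked illustration (the polynomials here are chosen for this example and do not come from the text), take $n = 2$ with $f_1 = x_1^2 + x_2 - 2$ and $f_2 = x_2^2 + x_1 - 2$, both of degree 2:

```latex
\begin{aligned}
f_1 &= \underbrace{x_1^2}_{f_{1,2}} + \underbrace{x_2}_{f_{1,1}} + \underbrace{(-2)}_{f_{1,0}},
  & F_1 &= x_1^2 + x_0 x_2 - 2x_0^2, \\
f_2 &= \underbrace{x_2^2}_{f_{2,2}} + \underbrace{x_1}_{f_{2,1}} + \underbrace{(-2)}_{f_{2,0}},
  & F_2 &= x_2^2 + x_0 x_1 - 2x_0^2.
\end{aligned}
```

Setting $x_0 = 0$ leaves $x_1^2 = x_2^2 = 0$, whose only solution is $x_1 = x_2 = 0$, so this system has no solutions at $\infty$. By contrast, $f_1 = x_1x_2 - 1$ and $f_2 = x_1^2 - 1$ reduce to $x_1x_2 = x_1^2 = 0$ at $x_0 = 0$, which has the nontrivial solution $(x_0, x_1, x_2) = (0, 0, 1)$.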
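The following result relates the algebra $A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_n\rangle$ to 
solutions at $\infty$. 

Lemma 2.3.4. The following are equivalent: 

$$\begin{aligned}
& f_1 = \cdots = f_n = 0 \text{ has no solutions at } \infty \\
\iff\ & \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}) \ne 0 \\
\iff\ & A \text{ has dimension } \mu = d_1 \cdots d_n \text{ over } K.
\end{aligned}$$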


Proof. Note that $F_i$ reduces to $f_{i,d_i}$ when $x_0 = 0$. Thus the $f_i$ have a solution 
at $\infty$ if and only if the system of homogeneous equations 

$$f_{1,d_1} = \cdots = f_{n,d_n} = 0$$

has a nontrivial solution. This gives the first equivalence. The second uses 
Bézout's theorem and some facts from algebraic geometry. See Section 3 of 
Chapter 3 of [CLO98] for the details. 

When there are no solutions at $\infty$, it follows that we get our algebra $A$ of 
dimension $d_1 \cdots d_n$ over $K$. But unlike Section 2.3.1, the solutions may have 
multiplicities $> 1$. In this case, we can relate resultants and multiplication 
maps as follows. 
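Theorem 2.3.5. If $f_0 = u_1x_1 + \cdots + u_nx_n$ and (2.19) has no solutions at $\infty$, 
then 

$$\begin{aligned}
\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)
&= \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})\, \det(M_{f_0}) \\
&= \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}) \prod_p (u_1p_1 + \cdots + u_np_n)^{\mathrm{mult}(p)}
\end{aligned}$$

and 

$$\begin{aligned}
\mathrm{Res}_{1,d_1,\dots,d_n}(u - f_0, f_1,\dots,f_n)
&= \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})\, \mathrm{CharPoly}_{M_{f_0}}(u) \\
&= \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}) \prod_p \bigl(u - (u_1p_1 + \cdots + u_np_n)\bigr)^{\mathrm{mult}(p)}.
\end{aligned}$$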


Proof. In each case, the first equality uses Theorem 3.4 of [CLO98, Ch. 3] and 
the second uses Proposition 2.1.14 of this chapter. 


While this is nice (and will have some unexpected consequences when we 
discuss Galois theory in Section 2.5), the relation between resultants and what 
we did in Section 2.3.1 goes much deeper. Here are some details. 


Computing Resultants. First recall that the method given in Section 2.3.1 
for computing the matrix $M_{f_0}$ of the multiplication map $M_{f_0}$ used the equality 

$$\widetilde{M}_{f_0} = M_{f_0}^{\,t}$$

from Theorem 2.3.3. As we noted after the proof, this equality requires that 
$\det(M_{11}) \ne 0$ (since $M_{11}^{-1}$ was used in the formula for $\widetilde{M}_{f_0}$ given in 
(2.24)). This relates to resultants as follows. 

A standard method for computing $\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)$ involves the 
quotient of two determinants. In our situation, the relevant formula is 

$$\det(M_0) = \mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)\, \det(M_0'), \tag{2.28}$$

where $M_0$ is precisely the matrix appearing in (2.22) and $M_0'$ is the submatrix 
described in Section 4 of Chapter 3 of [CLO98]. It follows that 

$$\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n) = \frac{\det(M_0)}{\det(M_0')}$$

whenever $\det(M_0') \ne 0$. The subtle point is that $\det(M_0)$ and $\det(M_0')$ can 
both vanish even though $\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)$ is nonzero. So to calculate 
the resultant using $M_0$, we definitely need $\det(M_0') \ne 0$. Yet for $\widetilde{M}_{f_0}$, we need 
$\det(M_{11}) \ne 0$. Here is the nice relation between these determinants. 


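Proposition 2.3.6. $\det(M_{11}) = \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})\, \det(M_0')$. 

Proof. First observe that by (2.22) and the definition of $\widetilde{M}_{f_0}$, we have 

$$\begin{aligned}
\det(M_0) &= \det\begin{pmatrix} I & -M_{01}M_{11}^{-1} \\ 0 & I \end{pmatrix} \det\begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix} \\
&= \det\begin{pmatrix} \widetilde{M}_{f_0} & 0 \\ M_{10} & M_{11} \end{pmatrix} = \det(\widetilde{M}_{f_0})\, \det(M_{11})
\end{aligned}$$

whenever $\det(M_{11}) \ne 0$. Using this with (2.28) and Theorems 2.3.5 and 2.3.3, 
we obtain 

$$\begin{aligned}
\det(\widetilde{M}_{f_0})\, \det(M_{11}) &= \det(M_0) \\
&= \mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)\, \det(M_0') \\
&= \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})\, \det(M_{f_0})\, \det(M_0') \\
&= \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})\, \det(\widetilde{M}_{f_0})\, \det(M_0')
\end{aligned}$$

when $f_1,\dots,f_n$ are sufficiently generic. Cancelling $\det(\widetilde{M}_{f_0})$ (which is nonzero 
generically) shows that the equality 

$$\det(M_{11}) = \mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n})\, \det(M_0')$$

holds generically. Since each side is a polynomial in the coefficients of the $f_i$, 
this equality must hold unconditionally. 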


We noted earlier that $\det(M_{11}) \ne 0$ implies that $\widetilde{M}_{f_0}$ is defined and satisfies 
$\widetilde{M}_{f_0} = M_{f_0}^{\,t}$. Then Proposition 2.3.6 shows that $\det(M_{11}) \ne 0$ also guarantees 
the following additional facts: 

• $\mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}) \ne 0$, so that (2.19) has no solutions at $\infty$. 
• $\det(M_0') \ne 0$, so that $\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n)$ can be computed using 
$M_0$ and $M_0'$. 

Hence the link between Section 2.3.1 and resultants is very strong. 
For experts, we observe that (2.28) and Proposition 2.3.6 imply that if 
$\det(M_0') \ne 0$, then we have 

$$\mathrm{Res}_{1,d_1,\dots,d_n}(f_0, f_1,\dots,f_n) = \frac{\det M_0}{\det M_0'}, \qquad
\mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}) = \frac{\det M_{11}}{\det M_0'},$$

where we observe that $M_0'$ is a submatrix of both $M_{11}$ and $M_0$. So $M_0$ allows 
us to compute not one but two resultants. Has this been noticed before? 


Genericity. In Section 2.3.1, we required that $f_1,\dots,f_n$ be "generic", which 
upon careful reading means first, that the system (2.19) has $d_1 \cdots d_n$ solutions 
of multiplicity 1, and second, that $\det(M_{11}) \ne 0$. In terms of resultants, this 
means the following: 

• $\mathrm{Res}_{d_1,\dots,d_n}(f_{1,d_1},\dots,f_{n,d_n}) \ne 0$. 
• $\mathrm{Res}_{d-1,d_1,\dots,d_n}\bigl(\det(\partial f_i/\partial x_j), f_1,\dots,f_n\bigr) \ne 0$, where $d$ is defined in (2.20). 
• $\det(M_0') \ne 0$. 

The first item guarantees that $A$ has the correct dimension by Lemma 2.3.4 
and the second guarantees that the Jacobian is nonvanishing at all solutions, 
so that every solution has multiplicity 1 by the implicit function theorem. 
Finally, the first and third conditions are equivalent to $\det(M_{11}) \ne 0$ by 
Proposition 2.3.6. 

One historical remark is that while the formula (2.28) is due to Macaulay 
in 1902, many ideas of Section 2.3.2 are present in the work of Kronecker in 
1882. For example, Kronecker defines 

$$\mathrm{Res}_{1,d_1,\dots,d_n}(u - f_0, f_1,\dots,f_n)$$

and shows that as a polynomial in $u$, its roots are $f_0(p)$ for $p$ a solution of 
(2.19). He also notes that the discriminant condition of the second bullet is 
needed to get solutions of multiplicity 1 and that when this is true, the ideal 
$\langle f_1,\dots,f_n\rangle$ is radical (see pp. 276 and 330 of [Kro31, Vol. II]). 


2.4 Factoring 


In earlier sections of this chapter, we studied quotient algebras 


$$A = K[x_1,\dots,x_n]/\langle f_1,\dots,f_s\rangle,$$


where K was usually algebraically closed. Here, we will work over more general 
fields and consider two factorization problems: factoring polynomials into a 
product of irreducible polynomials and factoring an ideal into an intersection 
of primary ideals (primary decomposition). We will also study the Theorem 
of the Primitive Element. 


2.4.1 Factoring over number fields 


Near the end of Section 2.3.1, we gave the formula (2.27) 

$$\det(\widetilde{M}_{f_0}) = \prod_p (u_1p_1 + \cdots + u_np_n),$$

where $f_0 = u_1x_1 + \cdots + u_nx_n$ and the product is over all solutions $p$. The point 
was that we could compute the left-hand side, so that if we knew how to factor 
multivariable polynomials over an algebraically closed field, then we could find 
all of the solutions. We will now turn the tables and use finite commutative 
algebras and their multiplication maps to do factoring over number fields. We 
begin with a lovely result of Dedekind. 
Dedekind Reciprocity. Suppose that $f(x), g(x) \in \mathbb{Q}[x]$ are irreducible with 
roots $\alpha, \beta \in \mathbb{C}$ such that $f(\alpha) = g(\beta) = 0$. Then factor $f(x)$ into irreducibles 
over $\mathbb{Q}(\beta)$, say 

$$f(x) = f_1(x) \cdots f_r(x), \quad f_i(x) \in \mathbb{Q}(\beta)[x]. \tag{2.29}$$

The $f_i(x)$ are distinct (i.e., none is a constant multiple of any of the others) 
since $f$ is separable. Then the Dedekind Reciprocity Theorem describes the 
factorization of $g(x)$ over $\mathbb{Q}(\alpha)$ as follows. 

Theorem 2.4.1. Given the above factorization of $f(x)$ into irreducibles over 
$\mathbb{Q}(\beta)$, the factorization of $g(x)$ into irreducibles over $\mathbb{Q}(\alpha)$ can be written as 

$$g(x) = g_1(x) \cdots g_r(x), \quad g_i(x) \in \mathbb{Q}(\alpha)[x],$$

where 

$$\frac{\deg(f_1)}{\deg(g_1)} = \frac{\deg(f_2)}{\deg(g_2)} = \cdots = \frac{\deg(f_r)}{\deg(g_r)} = \frac{\deg(f)}{\deg(g)}.$$


Proof. Consider the $\mathbb{Q}$-algebra 

$$A = \mathbb{Q}[x,y]/\langle f(x), g(y)\rangle.$$

Since $y \mapsto \beta$ induces $\mathbb{Q}[y]/\langle g(y)\rangle \simeq \mathbb{Q}(\beta)$ and the $f_i(x)$ are distinct irreducibles 
(because $f(x)$ is separable), we get algebra isomorphisms 

$$\begin{aligned}
A &\simeq \mathbb{Q}(\beta)[x]/\langle f(x)\rangle \\
&\simeq \mathbb{Q}(\beta)[x]/\langle f_1(x) \cdots f_r(x)\rangle \\
&\simeq \prod_{i=1}^r \underbrace{\mathbb{Q}(\beta)[x]/\langle f_i(x)\rangle}_{K_i},
\end{aligned} \tag{2.30}$$

where $K_i$ is a field since $f_i(x)$ is irreducible over $\mathbb{Q}(\beta)$. Thus $A$ is isomorphic to 
the product of the fields $K_1,\dots,K_r$. Furthermore, since $[K_i : \mathbb{Q}(\beta)] = \deg(f_i)$, 
the degree of $K_i$ over $\mathbb{Q}$ is 

$$[K_i : \mathbb{Q}] = [K_i : \mathbb{Q}(\beta)]\,[\mathbb{Q}(\beta) : \mathbb{Q}] = \deg(f_i)\deg(g). \tag{2.31}$$

Now interchange the roles of $f$ and $g$. The factorization of $g(y)$ into $s$ 
irreducibles $g_i(y)$ over $\mathbb{Q}(\alpha) \simeq \mathbb{Q}[x]/\langle f(x)\rangle$ gives an isomorphism between $A$ 
and a product of $s$ fields $A \simeq \prod_{i=1}^s K_i'$ such that 

$$[K_i' : \mathbb{Q}] = \deg(g_i)\deg(f). \tag{2.32}$$

However, the decomposition of $A$ into a product of fields is unique up to 
isomorphism (this is proved by studying the idempotents of $A$). Hence we 
must have $r = s$ and $K_i \simeq K_i'$ after a suitable permutation of indices. It 
follows that (2.31) and (2.32) must be equal for all $i$, and the result follows. 
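A small worked instance may help fix ideas. The polynomials below are chosen for illustration and are not taken from the text: let $f(x) = x^2 - 2$ and $g(x) = x^4 - 2$, with roots $\alpha = \sqrt{2}$ and $\beta = \sqrt[4]{2}$.

```latex
\begin{aligned}
\text{over } \mathbb{Q}(\beta):\quad & f(x) = (x - \beta^2)(x + \beta^2),
   && \deg(f_1) = \deg(f_2) = 1, \\
\text{over } \mathbb{Q}(\alpha):\quad & g(x) = (x^2 - \alpha)(x^2 + \alpha),
   && \deg(g_1) = \deg(g_2) = 2, \\
& \frac{\deg(f_1)}{\deg(g_1)} = \frac{\deg(f_2)}{\deg(g_2)} = \frac{1}{2}
   = \frac{\deg(f)}{\deg(g)} = \frac{2}{4}.
\end{aligned}
```

The factors $x^2 \mp \alpha$ are irreducible over $\mathbb{Q}(\alpha)$: the roots $\pm\sqrt[4]{2}$ have degree 4 over $\mathbb{Q}$, and the roots of $x^2 + \alpha$ are not real, whereas $\mathbb{Q}(\alpha) \subset \mathbb{R}$ has degree 2. Both factorizations have $r = 2$ factors, as the theorem predicts.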




According to [Edw04], Dedekind discovered this result in 1855, though 
his version wasn’t published until 1982. Kronecker found this theorem inde- 
pendently and stated it in his university lectures. Theorem 2.4.1 was first 
published by Kneser in 1887. 


A Factorization Algorithm. The algebra $A$ that we used in the proof of 
Theorem 2.4.1 can also be used to construct the factorization of $f(x)$ over 
$\mathbb{Q}(\beta)$. The idea is to compute 

$$\Phi(u) = \mathrm{CharPoly}_{M_{f_0}}(u), \quad f_0 = x + ty,$$

for an appropriately chosen $t \in \mathbb{Q}$. This polynomial in $\mathbb{Q}[u]$ can be computed 
using the methods of Section 2.1 and factored using known algorithms for 
factoring polynomials in $\mathbb{Q}[u]$. Kronecker observed that these factors determine 
the factorization of $f(x)$ over $\mathbb{Q}(\beta)$. Here is his result. 

Theorem 2.4.2. Assume that $f_0 = x + ty$ takes distinct values at the solutions 
of $f(x) = g(y) = 0$ and let 

$$\Phi(u) = \prod_{i=1}^r \Phi_i(u)$$

be the irreducible factorization of $\Phi(u)$ in $\mathbb{Q}[u]$. Then the irreducible 
factorization of $f(x)$ over $\mathbb{Q}(\beta)$ is 

$$f(x) = c\, f_1(x) \cdots f_r(x),$$

where $c \in K^*$ and 

$$f_i(x) = \gcd(\Phi_i(x + t\beta), f(x)).$$

(Note that the gcd is computed in $\mathbb{Q}(\beta)[x]$.) 


Proof. If $n = \deg(f(x))$ and $m = \deg(g(y))$, then Bézout's theorem implies 
that the equations $f(x) = g(y) = 0$ have at most $nm$ solutions in $x$ and $y$ 
counted with multiplicity. But in fact there are exactly $nm$ solutions since 
$f(x)$ and $g(y)$ are separable. It follows that all of the multiplicities are 1. We 
now apply the methods of Section 2.1. 

By assumption, $f_0 = x + ty$ takes distinct values at all solutions of $f(x) = 
g(y) = 0$. Since they have multiplicity 1, the multiplication map $M_{f_0} : A \to A$ 
is non-derogatory, where $A = \mathbb{Q}[x,y]/\langle f(x), g(y)\rangle$. Then the single-variable 
representation (Proposition 2.1.12) implies that $u \mapsto [x + ty] \in A$ induces 

$$\mathbb{Q}[u]/\langle \Phi(u)\rangle \simeq A$$

since $\Phi(u)$ is the characteristic polynomial of multiplication by $f_0 = x + ty$ on 
the algebra $A$. Notice also that 

$$\mathrm{Disc}(\Phi(u)) \ne 0$$

since the eigenvalues all have multiplicity 1. This implies that the above 
factorization of $\Phi(u)$ is a product of distinct irreducibles. 

By the Chinese Remainder Theorem, such a factorization gives a 
decomposition 

$$A \simeq \mathbb{Q}[u]/\langle \Phi(u)\rangle \simeq \prod_{i=1}^r \mathbb{Q}[u]/\langle \Phi_i(u)\rangle$$

into a product of fields. Using the definition of $A$, this transforms into the 
product of fields given by 

$$A \simeq \mathbb{Q}[x,y]/\langle g(y), f(x)\rangle \simeq \prod_{i=1}^r \mathbb{Q}[x,y]/\langle g(y), f(x), \Phi_i(x + ty)\rangle. \tag{2.33}$$

(Exercise: Prove this.) Since $y \mapsto \beta$ induces $\mathbb{Q}[y]/\langle g(y)\rangle \simeq \mathbb{Q}(\beta)$, we can 
rewrite (2.33) as a product of fields 

$$A \simeq \prod_{i=1}^r \mathbb{Q}(\beta)[x]/\langle f(x), \Phi_i(x + t\beta)\rangle. \tag{2.34}$$

However, $\mathbb{Q}(\beta)[x]$ is a PID, so that $\langle f(x), \Phi_i(x + t\beta)\rangle$ is the principal ideal 
generated by $f_i(x) = \gcd(f(x), \Phi_i(x + t\beta))$. Since each factor in the product 
in (2.34) is a field, we see that $f_i(x)$ is irreducible over $\mathbb{Q}(\beta)$. Furthermore, 
each $f_i(x)$ divides $f(x)$. It remains to show that $f(x)$ equals $f_1(x) \cdots f_r(x)$ up 
to a constant. 

We proved above that $\Phi(u)$ has distinct roots, so that the same is true for 
$\Phi(x + t\beta)$. It follows that in the factorization $\Phi(x + t\beta) = \prod_{i=1}^r \Phi_i(x + t\beta)$, 
the factors $\Phi_i(x + t\beta)$ have distinct roots as we vary $i$. Hence the same is true 
for the $f_i(x)$. Hence their product divides $f(x)$ since each one does. However, 
using (2.34) and degree calculations similar to those of Theorem 2.4.1, we see that 

$$\dim_{\mathbb{Q}} A = \sum_{i=1}^r \deg(f_i)\deg(g).$$

(Exercise: Supply the details.) Since $\dim_{\mathbb{Q}} A = nm = \deg(f)\deg(g)$, we see 
that $\deg(f) = \deg(f_1 \cdots f_r)$, and the theorem follows. 


This theorem leads to the following algorithm for factoring $f(x)$ over $\mathbb{Q}(\beta)$: 

• Pick a random $t \in \mathbb{Q}$ and compute $\Phi(u) = \mathrm{CharPoly}_{M_{f_0}}(u)$ for $f_0 = x + ty$. 
Also compute $\mathrm{Disc}(\Phi(u))$. 

• If $\mathrm{Disc}(\Phi(u)) \ne 0$, then factor $\Phi(u) = \prod_{i=1}^r \Phi_i(u)$ into irreducibles in $\mathbb{Q}[u]$ 
and for each $i$ compute $\gcd(\Phi_i(x + t\beta), f(x))$ in $\mathbb{Q}(\beta)[x]$. This gives the 
desired factorization. 

• If $\mathrm{Disc}(\Phi(u)) = 0$, then pick a new $t \in \mathbb{Q}$ and return to the first bullet. 

Since $\mathrm{Disc}(\Phi(u)) \ne 0$ if and only if $x + ty$ takes distinct values at the solutions 
of $f(x) = g(y) = 0$, Theorem 2.4.2 implies that the second bullet correctly 
computes the required factorization when the discriminant is nonzero. Notice 
that the second bullet uses the Euclidean algorithm in $\mathbb{Q}(\beta)[x]$, which can be 
done constructively using the representation $\mathbb{Q}(\beta) \simeq \mathbb{Q}[y]/\langle g(y)\rangle$. 

As for the third bullet, the number of $t \in \mathbb{Q}$ that satisfy the equation 
$\mathrm{Disc}(\Phi(u)) = 0$ is bounded above by $\tfrac{1}{2}nm(nm - 1)$, where $n = \deg(f(x))$, 
$m = \deg(g(y))$. (Exercise: Prove this.) Thus the third bullet can occur at most 
$\tfrac{1}{2}nm(nm - 1)$ times. It follows that the above algorithm is deterministic. 
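The steps of this algorithm can be followed by hand on a deliberately simple case. The choice $f(x) = x^2 - 2$, $g(y) = y^2 - 2$, $\beta = \sqrt{2}$ is an assumption made for illustration; the four solutions of $f(x) = g(y) = 0$ are $(\pm\sqrt2, \pm\sqrt2)$.

```latex
\begin{aligned}
t = 1:\quad & x + y \in \{2\sqrt2,\ 0,\ 0,\ -2\sqrt2\},
  && \Phi(u) = u^2(u^2 - 8),\quad \mathrm{Disc}(\Phi) = 0 \ \text{(pick a new } t\text{)}, \\
t = 2:\quad & x + 2y \in \{3\sqrt2,\ -\sqrt2,\ \sqrt2,\ -3\sqrt2\},
  && \Phi(u) = (u^2 - 18)(u^2 - 2),\quad \mathrm{Disc}(\Phi) \ne 0, \\
& \Phi_1(x + 2\beta) = (x + 2\sqrt2)^2 - 18 = (x - \sqrt2)(x + 5\sqrt2),
  && f_1 = \gcd(\Phi_1(x + 2\beta), f) = x - \sqrt2, \\
& \Phi_2(x + 2\beta) = (x + 2\sqrt2)^2 - 2 = (x + \sqrt2)(x + 3\sqrt2),
  && f_2 = \gcd(\Phi_2(x + 2\beta), f) = x + \sqrt2.
\end{aligned}
```

This recovers $f(x) = (x - \beta)(x + \beta)$ over $\mathbb{Q}(\beta)$. The rejected value $t = 1$ is one of the at most $\tfrac12 \cdot 4 \cdot 3 = 6$ values of $t$ for which the discriminant can vanish.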

An alternate approach would be to follow what Kronecker does on pages 
258-259 of [Kro31, Vol. II] and regard $t$ as a variable in $f_0 = x + ty$. Then 
$\Phi(u)$ becomes a polynomial $\Phi(u,t) \in \mathbb{Q}[u,t]$. If one can factor $\Phi(u,t)$ into 
irreducibles in $\mathbb{Q}[u,t]$, say $\Phi(u,t) = \prod_{i=1}^r \Phi_i(u,t)$, then it is straightforward to 
recover $f_i(x, \beta)$ from $\Phi_i(x + t\beta, t)$. A rigorously constructive version of this is 
described in [Edw04]. 


2.4.2 Finite fields and primitive elements 


Here, we will give two further applications of finite commutative algebras. 


Factoring over Finite Fields. We begin with a brief description for 
factoring a polynomial $f(x) \in \mathbb{F}_q[x]$, where $\mathbb{F}_q$ is a finite field with $q = p^\ell$ elements. 
We will use the algebra 

$$A = \mathbb{F}_q[x]/\langle f(x)\rangle$$

and the Frobenius map 

$$\mathrm{Frob} : A \to A, \quad \mathrm{Frob}(a) = a^q.$$

This map is linear over $\mathbb{F}_q$ and has 1 as an eigenvalue since $1^q = 1$. As in 
Section 2.1, $E_A(\mathrm{Frob}, 1)$ denotes the corresponding eigenspace. This eigenspace 
determines whether or not $f(x)$ is irreducible as follows. 


Proposition 2.4.3. If $f(x)$ has no multiple roots, i.e., $\gcd(f(x), f'(x)) = 1$, 
then the dimension of the eigenspace $E_A(\mathrm{Frob}, 1)$ is the number of irreducible 
factors of $f(x)$. 

Proof. Since $f(x)$ has no multiple roots, a factorization $f(x) = f_1(x) \cdots f_r(x)$ 
into irreducible polynomials in $\mathbb{F}_q[x]$ gives an algebra isomorphism 

$$A \simeq \prod_{i=1}^r K_i, \quad K_i = \mathbb{F}_q[x]/\langle f_i(x)\rangle,$$

compatible with the Frobenius map Frob. If $a \in K_i$, then since $K_i$ is a field, 
we have the equivalences 

$$\mathrm{Frob}(a) = a \iff a^q = a \iff a \in \mathbb{F}_q.$$

It follows that on $K_i$, the eigenvalue 1 has a 1-dimensional eigenspace 
$E_{K_i}(\mathrm{Frob}, 1)$. Since the eigenspace $E_A(\mathrm{Frob}, 1)$ is the direct sum of the 
$E_{K_i}(\mathrm{Frob}, 1)$, the result follows. 


Here is a simple example of this result. 


Example 2.4.4. Let $f(x) = x^5 + x^4 + 1 \in \mathbb{F}_2[x]$. One easily sees that $f(x)$ is 
separable. Then $A = \mathbb{F}_2[x]/\langle f(x)\rangle$ is a vector space over $\mathbb{F}_2$ of dimension 5 
with basis $[1], [x], [x^2], [x^3], [x^4]$, which for simplicity we write as $1, x, x^2, x^3, x^4$. 

Note that $\mathrm{Frob} : A \to A$ is the squaring map since $q = 2$. To compute the 
matrix of Frob, we apply Frob to each basis element and represent the result 
in terms of the basis: 

$$\begin{aligned}
1 &\mapsto 1 \\
x &\mapsto x^2 \\
x^2 &\mapsto x^4 \\
x^3 &\mapsto x^6 = 1 + x + x^4 \\
x^4 &\mapsto x^8 = 1 + x + x^2 + x^3 + x^4.
\end{aligned}$$

Here, $x^6 = 1 + x + x^4$ means that $1 + x + x^4$ is the remainder of $x^6$ on division 
by $f(x) = x^5 + x^4 + 1$, and similarly for the last line. Hence the matrix of 
$\mathrm{Frob} - 1_A$ is 

$$\begin{pmatrix} 1&0&0&1&1 \\ 0&0&0&1&1 \\ 0&1&0&0&1 \\ 0&0&0&0&1 \\ 0&0&1&1&1 \end{pmatrix}
- \begin{pmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \end{pmatrix}
= \begin{pmatrix} 0&0&0&1&1 \\ 0&1&0&1&1 \\ 0&1&1&0&1 \\ 0&0&0&1&1 \\ 0&0&1&1&0 \end{pmatrix}$$

(remember that we are in characteristic 2). This matrix has rank 3 since the 
first column is zero and the sum of the last three columns is zero. (Exercise: 
Check the rank carefully.) Hence we have two linearly independent eigenvectors 
for the eigenvalue 1. By Proposition 2.4.3, $f(x)$ is not irreducible over $\mathbb{F}_2$. 
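The rank computation in this example is easy to replay in Python. The sketch below makes one representational assumption: a polynomial over $\mathbb{F}_2$ is stored as an integer bit mask (bit $i$ is the coefficient of $x^i$), so that addition is XOR. The helper names `pmod` and `rank_gf2` are hypothetical.

```python
def pmod(a, b):
    """Remainder of a modulo b for polynomials over F_2 stored as bit masks."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def rank_gf2(vectors):
    """Rank over F_2 of vectors given as bit masks (XOR elimination)."""
    basis = []                       # kept in decreasing order, distinct top bits
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)        # clear the top bit of b from v if present
        if v:
            basis.append(v)
            basis.sort(reverse=True)
    return len(basis)

f = 0b110001                                          # x^5 + x^4 + 1
deg = f.bit_length() - 1
frob = [pmod(1 << (2 * i), f) for i in range(deg)]    # x^i -> x^(2i) mod f
cols = [frob[i] ^ (1 << i) for i in range(deg)]       # columns of Frob - 1_A
nullity = deg - rank_gf2(cols)                        # dim of the eigenspace
```

The list `frob` reproduces the table above (for instance `frob[3]` is `0b10011`, i.e. $1 + x + x^4$), the rank is 3, and `nullity` is 2, matching the two irreducible factors predicted by Proposition 2.4.3.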


Besides giving the number of irreducible factors of $f(x)$, one can also use 
the eigenspace $E_A(\mathrm{Frob}, 1)$ to construct the irreducible factorization of $f(x)$. 
The rough idea is that if $[h(x)] \in A$ is a nonzero element of $E_A(\mathrm{Frob}, 1)$, 
then $\gcd(h(x), f(x))$ is a factor of $f$ and (if $h(x)$ is chosen correctly) is actually 
one of the irreducible factors of $f(x)$. This is Berlekamp's algorithm, which is 
described in Section 4.1 of [LN83]. 
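For $f(x) = x^5 + x^4 + 1$ over $\mathbb{F}_2$, this gcd step can be sketched as follows. This is a minimal illustration rather than a full implementation of Berlekamp's algorithm: polynomials are stored as integer bit masks (an assumption of this sketch), the element $h = x^4 + x^3 + x^2$ of the eigenspace was found by solving the linear system by hand, and the helper names are hypothetical.

```python
def pmod(a, b):
    """Remainder of a modulo b for polynomials over F_2 stored as bit masks."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def pgcd(a, b):
    """Euclidean algorithm in F_2[x]."""
    while b:
        a, b = b, pmod(a, b)
    return a

def pmul(a, b):
    """Carry-less product, i.e. multiplication in F_2[x]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

f = 0b110001                      # x^5 + x^4 + 1
h = 0b11100                       # x^4 + x^3 + x^2, chosen from the eigenspace
fixed = pmod(pmul(h, h), f) == h  # Frob(h) = h^2 = h in A
g1 = pgcd(f, h)                   # gcd(f, h)
g2 = pgcd(f, h ^ 1)               # gcd(f, h + 1)
```

Here `g1` is `0b111` ($x^2 + x + 1$) and `g2` is `0b1011` ($x^3 + x + 1$), and `pmul(g1, g2)` returns `f`, so $f(x) = (x^2 + x + 1)(x^3 + x + 1)$. Since the eigenspace has dimension 2, these two factors are irreducible.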


Theorem of the Primitive Element. The single-variable representation 
used in the proof of Theorem 2.4.2 may remind the reader of the Theorem of 
the Primitive Element. As we will now show, this is no accident. 
Theorem 2.4.5. Let $K \subset L = K(\alpha_1,\dots,\alpha_n)$ be an extension such that $K$ is 
infinite and each $\alpha_i$ is separable over $K$. Then there are $t_1,\dots,t_n \in K$ such 
that 

$$L = K(\alpha), \quad \alpha = t_1\alpha_1 + \cdots + t_n\alpha_n.$$

Proof. Let $f_i$ be the minimal polynomial of $\alpha_i$ over $K$ and let 

$$A = K[x_1,\dots,x_n]/\langle f_1(x_1),\dots,f_n(x_n)\rangle.$$

Note that we use a separate variable $x_i$ for each polynomial $f_i$. Then arguing 
as in the proof of Theorem 2.4.2, one easily sees that by Bézout's theorem, all 
solutions of 

$$f_1(x_1) = f_2(x_2) = \cdots = f_n(x_n) = 0 \tag{2.35}$$

have multiplicity 1. Since $K$ is infinite, we can pick $t_1,\dots,t_n \in K$ such that 
$f_0 = t_1x_1 + \cdots + t_nx_n$ takes distinct values at all solutions of (2.35). It follows 
that $M_{f_0} : A \to A$ is non-derogatory, so that by Proposition 2.1.12, the map 
$u \mapsto [t_1x_1 + \cdots + t_nx_n] \in A$ induces a surjection 

$$K[u] \longrightarrow A.$$

Furthermore, the map $A \to L$ induced by $x_i \mapsto \alpha_i$ is surjective since $L = 
K(\alpha_1,\dots,\alpha_n)$. The theorem follows since the composition $K[u] \to A \to L$ is 
surjective and maps $u$ to $\alpha = t_1\alpha_1 + \cdots + t_n\alpha_n$. 
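A standard illustration of the theorem (added here as an example; it is not part of the text) is $K = \mathbb{Q}$, $\alpha_1 = \sqrt2$, $\alpha_2 = \sqrt3$ with $t_1 = t_2 = 1$. The form $f_0 = x_1 + x_2$ takes the four distinct values $\pm\sqrt2 \pm\sqrt3$ at the solutions of $x_1^2 - 2 = x_2^2 - 3 = 0$, and $\alpha = \sqrt2 + \sqrt3$ is a primitive element:

```latex
\begin{aligned}
\alpha^2 &= 5 + 2\sqrt6, \\
\alpha^3 &= (\sqrt2 + \sqrt3)(5 + 2\sqrt6) = 11\sqrt2 + 9\sqrt3, \\
\sqrt2 &= \tfrac12(\alpha^3 - 9\alpha), \qquad
\sqrt3 = \tfrac12(11\alpha - \alpha^3).
\end{aligned}
```

Thus both generators are polynomials in $\alpha$, so $\mathbb{Q}(\sqrt2, \sqrt3) = \mathbb{Q}(\sqrt2 + \sqrt3)$.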


Here is an example to illustrate the role of separability. 




Example 2.4.6. Let $K = \mathbb{F}_p(t,u)$, where $t$ and $u$ are variables, and let $K \subset L$ 
be the field obtained by adjoining the $p$th roots of $t$ and $u$. This extension 
is purely inseparable of degree $p^2$ and $L \ne K(\alpha)$ for all $\alpha \in L$ since $\alpha \in L$ 
implies that $\alpha^p \in K$. (Exercise: Prove these assertions carefully.) 

Hence the single-variable representation of Proposition 2.1.12 must fail. 
To see the underlying geometric reason for this failure, first observe that 

$$L \simeq K[x,y]/\langle x^p - t,\ y^p - u\rangle$$

is the algebra from the proof of Theorem 2.4.5. In the algebraic closure $\overline{K}$ of 
$K$, the only solution of 

$$x^p - t = y^p - u = 0$$

is given by $x = \sqrt[p]{t}$ and $y = \sqrt[p]{u}$. The local ring at this point is 

$$\overline{K}[x,y]/\langle x^p - t,\ y^p - u\rangle = \overline{K}[x,y]/\langle (x - \sqrt[p]{t})^p,\ (y - \sqrt[p]{u})^p\rangle \simeq \overline{K}[x,y]/\langle x^p, y^p\rangle,$$

which clearly has embedding dimension 2 and hence is not curvilinear. It 
follows that $M_f$ is derogatory for all $f \in K[x,y]$. Since the single-variable 
representation requires that $M_f$ be non-derogatory, we can see why the 
Theorem of the Primitive Element fails in this case. 
2.4.3 Primary decomposition 


The final task of Section 2.4 is to extend the factorizations introduced in 
Section 2.4.1 to the realm of ideals. Suppose that 

$$f_1(x_1,\dots,x_n) = \cdots = f_s(x_1,\dots,x_n) = 0 \tag{2.36}$$

is a system of equations with coefficients in a field $K$ and only finitely many 
solutions over the algebraic closure $\overline{K}$. (Thus we are back in the situation where 
the number of equations need not equal the number of variables.) We say that 
$\langle f_1,\dots,f_s\rangle$ is zero-dimensional since a finite set of points has dimension 0. 
Our goal is to give an algorithm for computing the primary decomposition of 
a zero-dimensional ideal. 


Theoretical Results. An ideal J C K[21,...,2n] is primary if fg € I always 
implies that either f € J or gN € I for some N > 1. It is easy to see that the 
radical VT of a primary ideal is prime. By Chapter 4, §7 of [CLO97], every 
ideal J C K[a1,...,a,] has a primary decomposition 


T=hn- 0], (2.37) 


into an intersection of primary ideals. We say that (2.37) is minimal when r 
is as small as possible. 

In the zero-dimensional case, the primary components I; of (f1,..., fs) 
can be obtained from the given ideal by adding one more carefully chosen 
polynomial u;. Here is the precise result. 


Lemma 2.4.7. A zero-dimensional ideal $(f_1,\dots,f_s)$ has a minimal primary
decomposition

\[ (f_1,\dots,f_s) = I_1 \cap \cdots \cap I_r \]

such that $\sqrt{I_1},\dots,\sqrt{I_r}$ are distinct maximal ideals. Furthermore, for each $i$,

\[ I_i \not\subset \bigcup_{j \ne i} \sqrt{I_j}, \]

and any $u_i \in I_i \setminus \bigcup_{j \ne i} \sqrt{I_j}$ has the property that

\[ I_i = (f_1,\dots,f_s,u_i). \]


Proof. Let $(f_1,\dots,f_s) = I_1 \cap \cdots \cap I_r$ be a minimal primary decomposition.
Note that $I_i$ and hence $\sqrt{I_i}$ are zero-dimensional since $(f_1,\dots,f_s)$ is. We
also know that $\sqrt{I_i}$ is prime. But zero-dimensional prime ideals are maximal.
(Exercise: Prove this.) Hence the $\sqrt{I_i}$ are maximal. Furthermore, if
$\sqrt{I_i} = \sqrt{I_j}$ for some $i \ne j$, then $I_i \cap I_j$ is primary. (Exercise: Supply a
proof.) This contradicts the minimality of our representation. Hence the $\sqrt{I_i}$
are distinct.

If $I_i \subset \bigcup_{j \ne i} \sqrt{I_j}$, then $I_i \subset \sqrt{I_j}$ for some $j \ne i$ by the Prime
Avoidance Theorem ([Sha90, Th. 3.61]). This implies that $\sqrt{I_i} \subset \sqrt{I_j}$ and hence
$\sqrt{I_i} = \sqrt{I_j}$ since the radicals are maximal. This contradiction proves that
$I_i \not\subset \bigcup_{j \ne i} \sqrt{I_j}$.

Now let $u_i \in I_i \setminus \bigcup_{j \ne i} \sqrt{I_j}$. Then we certainly have
$(f_1,\dots,f_s,u_i) \subset I_i$. For the opposite inclusion, take $j \ne i$ and note that
$u_i \notin \sqrt{I_j}$ implies that $1 + u_ir_j \in \sqrt{I_j}$ for some $r_j$ since $\sqrt{I_j}$ is
maximal. (Exercise: Prove this.) Thus $(1 + u_ir_j)^{N_j} \in I_j$ for some $N_j \ge 1$.
Expanding the product

\[ \prod_{j \ne i} (1 + u_ir_j)^{N_j} \in \prod_{j \ne i} I_j \subset \bigcap_{j \ne i} I_j, \]

we see that $1 + u_ir \in \bigcap_{j \ne i} I_j$ for some $r$. Now take $a \in I_i$. Then

\[ a(1 + u_ir) \in I_i \cap \bigcap_{j \ne i} I_j = (f_1,\dots,f_s). \]

Hence $a = a(1 + u_ir) + u_i(-ar) \in (f_1,\dots,f_s) + (u_i) = (f_1,\dots,f_s,u_i)$, as
desired.


In the zero-dimensional case, one can also prove that the ideals $I_i$ in the
primary decomposition are unique. For general ideals, uniqueness need not
hold (see Exercise 6 of Chapter 4, §7 of [CLO97] for an example) due to the
phenomenon of embedded components.

The most commonly used algorithm for computing the primary decomposition
of a zero-dimensional ideal is described in [GTZ88] and uses Gröbner
bases plus a change of coordinates to find the $u_i$ of Lemma 2.4.7. However,
the recent paper [Mon02] of C. Monico shows how to find the $u_i$ using the
quotient algebra

\[ A = K[x_1,\dots,x_n]/(f_1,\dots,f_s). \]

We will describe Monico's method, beginning with the following special case.


The Rational Case. The solutions of (2.36) are rational over $K$ if all
solutions in $\overline{K}^n$ actually lie in $K^n$. In this situation, it is easy to see that the
primary decomposition is

\[ (f_1,\dots,f_s) = \bigcap_p I_p, \]

where the intersection is over all solutions $p$ of (2.36). Furthermore, as we
noted in (2.6), the primary component $I_p$ is

\[ I_p = \{ f \in K[x_1,\dots,x_n] \mid gf \in (f_1,\dots,f_s) \text{ for some } g \in K[x_1,\dots,x_n] \text{ with } g(p) \ne 0 \}, \]

and $\sqrt{I_p}$ is the maximal ideal $(x_1 - p_1,\dots,x_n - p_n)$ when $p = (p_1,\dots,p_n)$.
Unfortunately, this elegant description of $I_p$ is not useful for computational
purposes. But we can use the methods of Section 2.1 to find the polynomials
$u_i$ of Lemma 2.4.7 as follows.


Proposition 2.4.8. Suppose that $(f_1,\dots,f_s)$ is zero-dimensional and all
solutions of (2.36) are rational over $K$. If $f \in K[x_1,\dots,x_n]$ takes distinct values
at the solutions of (2.36), then for each solution $p$, the corresponding primary
component is

\[ I_p = \bigl(f_1,\dots,f_s,\; (f - f(p))^{\mathrm{mult}(p)}\bigr). \]


Proof. Let $u_p = (f - f(p))^{\mathrm{mult}(p)}$. By Lemma 2.4.7, it suffices to show that
$u_p \in I_p$ and $u_p \notin \sqrt{I_q}$ for all solutions $q \ne p$. Since $\sqrt{I_q}$ is the maximal ideal
of $q$, the latter condition is equivalent to the non-vanishing of $u_p$ at $q$, which
follows since $f$ takes distinct values at the solutions.

To prove that $u_p \in I_p$, let $v_p = \prod_{q \ne p} (f - f(q))^{\mathrm{mult}(q)}$. By
Proposition 2.1.14,

\[ u_pv_p = \mathrm{CharPoly}_{M_f}(f). \tag{2.38} \]

However, the Cayley-Hamilton theorem tells us that $\mathrm{CharPoly}_{M_f}(M_f)$ is the
zero operator on $A$. Applied to $[1] \in A$, we obtain

\[ [0] = \mathrm{CharPoly}_{M_f}(M_f)[1] = \mathrm{CharPoly}_{M_f}([f]) = [\mathrm{CharPoly}_{M_f}(f)]. \]

Combined with (2.38), this implies

\[ u_pv_p = \mathrm{CharPoly}_{M_f}(f) \in (f_1,\dots,f_s) \subset I_p. \]

Since $I_p$ is primary, either $u_p$ or some power of $v_p$ lies in $I_p$. But

\[ v_p(p) = \prod_{q \ne p} (f(p) - f(q))^{\mathrm{mult}(q)} \ne 0 \]

since $f$ takes distinct values at the solutions. Hence no power of $v_p$ lies in $I_p$,
so that $u_p \in I_p$.


Here is an example of this proposition. 


Example 2.4.9. Consider the ideal
$(x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y) \subset \mathbb{Q}[x,y]$.
We saw in Example 2.1.1 that the corresponding equations have solutions
$(0,0)$ and $(0,1)$, which are rational over $\mathbb{Q}$. Since $y$ takes distinct values
at the solutions, we can use $f = y$ in Proposition 2.4.8 to compute the primary
decomposition.

By Example 2.1.5, the characteristic polynomial of $M_y$ is $u^2(u - 1)^3$. It
follows that the primary components are

\[ I_{(0,0)} = (x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y,\; y^2) = (x^2, y) \]
\[ I_{(0,1)} = (x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y,\; (y - 1)^3) = (x^2 + 2(y - 1),\; x(y - 1),\; (y - 1)^2). \]

(Exercise: Verify the final equality using the congruences

\[ y(y - 1)^2 \equiv (y - 1)^2 \bmod (y - 1)^3 \quad\text{and}\quad y(y - 1) \equiv y - 1 \bmod (y - 1)^2.) \]

Putting these together, we obtain the primary decomposition

\[ (x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y) = (x^2, y) \cap (x^2 + 2(y - 1),\; x(y - 1),\; (y - 1)^2) = I_{(0,0)} \cap I_{(0,1)} \]

given in Example 2.2.15.


We note that in Proposition 2.4.8, one can replace the characteristic poly- 
nomial with the minimal polynomial. Here is the precise result. 


Proposition 2.4.10. Suppose that $(f_1,\dots,f_s)$ is zero-dimensional and all
solutions of (2.36) are rational over $K$. If $f \in K[x_1,\dots,x_n]$ takes distinct values
at the solutions of (2.36), then for each solution $p$, the corresponding primary
component is

\[ I_p = \bigl(f_1,\dots,f_s,\; (f - f(p))^{n(p)}\bigr), \]

where $\mathrm{MinPoly}_{M_f}(u) = \prod_p (u - f(p))^{n(p)}$.

(Exercise: Prove this proposition.) Here is an example.

Example 2.4.11. For the ideal of Example 2.4.9, recall from Example 2.1.5
that the minimal polynomial of $y$ is $u(u - 1)^2$. Thus

\[ I_{(0,0)} = (x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y,\; y) = (x^2, y) \]
\[ I_{(0,1)} = (x^2 + 2y^2 - 2y,\; xy^2 - xy,\; y^3 - 2y^2 + y,\; (y - 1)^2) = (x^2 + 2(y - 1),\; x(y - 1),\; (y - 1)^2). \]

This gives the same primary decomposition as Example 2.4.9, though the
initial description of the primary components is simpler because the minimal
polynomial has smaller exponents than the characteristic polynomial.


The General Case. Now suppose that $K$ is a field and that the equations
(2.36) have solutions whose coordinates may lie in a strictly larger field. This
means that in the primary decomposition over $K$, the number of primary
components no longer equals the number of solutions. Here is an example
taken from [Mon02].


Example 2.4.12. The equations $x^2 - 2 = y^2 - 2 = 0$ have four solutions
$(\pm\sqrt{2}, \pm\sqrt{2})$, none of which is rational over $\mathbb{Q}$. We will see below that the
primary decomposition of $(x^2 - 2,\; y^2 - 2) \subset \mathbb{Q}[x,y]$ is

\[ (x^2 - 2,\; y^2 - 2) = I_1 \cap I_2 = (x^2 - 2,\; x - y) \cap (x^2 - 2,\; x + y). \]

Note that the ideal $I_1$ corresponds to $\pm(\sqrt{2}, \sqrt{2})$ while $I_2$ corresponds to
$\pm(\sqrt{2}, -\sqrt{2})$.

Here is a description of the primary decomposition of an arbitrary
zero-dimensional ideal.


Proposition 2.4.13. Suppose that $(f_1,\dots,f_s)$ is zero-dimensional and
$f \in K[x_1,\dots,x_n]$ takes distinct values at the solutions of $f_1 = \cdots = f_s = 0$. If the
irreducible factorization of $\mathrm{CharPoly}_{M_f}(u)$ is

\[ \mathrm{CharPoly}_{M_f}(u) = \prod_{i=1}^r p_i(u)^{m_i}, \]

where $p_1(u),\dots,p_r(u)$ are distinct monic irreducible polynomials, then the
primary decomposition of $(f_1,\dots,f_s)$ is given by

\[ (f_1,\dots,f_s) = I_1 \cap \cdots \cap I_r, \]

where

\[ I_i = \bigl(f_1,\dots,f_s,\; p_i(f)^{m_i}\bigr). \]

Proof. We will use Galois theory to prove the proposition in the special case
when $K$ is perfect (see [Mon02] for the general case). This means that either
$K$ has characteristic zero, or $K$ has characteristic $p > 0$ and every element of
$K$ is a $p$th power. Every finite extension of a perfect field is separable.

If $I_i$ is a primary component of $(f_1,\dots,f_s)$, then its radical $\sqrt{I_i}$ is prime
in $K[x_1,\dots,x_n]$. Then the following are true:

• The variety $V(I_i) = V(\sqrt{I_i}) \subset \overline{K}^n$ is irreducible over $K$.
• The Galois group $\mathrm{Gal}(\overline{K}/K)$ acts on $V(I_i)$.

These bullets imply that the action of $\mathrm{Gal}(\overline{K}/K)$ on each $V(I_i)$ is transitive.
(Exercise: Prove this.) Hence all $p \in V(I_i)$ have the same multiplicity, denoted
$m_i$. Also note that $V(I_i) \cap V(I_j) = \emptyset$ for $i \ne j$. (Exercise: Prove this.)

By Proposition 2.1.14, we see that

\[ \mathrm{CharPoly}_{M_f}(u) = \prod_{i=1}^r \prod_{p \in V(I_i)} (u - f(p))^{m_i}. \]

Since $f$ has coefficients in $K$, we see that $\sigma(f(p)) = f(q)$ whenever
$\sigma \in \mathrm{Gal}(\overline{K}/K)$ takes $p$ to $q$. But we also know that the $f(p)$ are all distinct and $K$
is perfect. Thus standard arguments from Galois theory imply that
$p_i(u) = \prod_{p \in V(I_i)} (u - f(p))$ is irreducible over $K$. (Exercise: Supply the details.) It
follows that the above factorization coincides with the one in the statement
of the proposition.

From here, the rest of the proof is similar to what we did in the proof of
Proposition 2.4.8. The key point as always is that $f$ takes distinct values at
the solutions. (Exercise: Complete the proof.)

The above proof shows that when $K$ is perfect, the $m_i$'s compute the
multiplicities of the corresponding points. However, this can fail when $K$ is
not perfect. We should also mention that one can weaken the hypothesis that
$f$ takes distinct values at the solutions: an analysis of the proof in [Mon02]
reveals that it is sufficient to assume that $f(p) \ne f(q)$ whenever $p$ and $q$ are
solutions of (2.36) lying in different orbits of the $\mathrm{Gal}(\overline{K}/K)$-action. When this
happens, however, the exponent $m_i$ may fail to equal the multiplicity.

Here is an example of Proposition 2.4.13. 


Example 2.4.14. For the ideal $(x^2 - 2,\; y^2 - 2) \subset \mathbb{Q}[x,y]$ of Example 2.4.12,
one easily sees that $f = x + 2y$ takes distinct values at the solutions and has
characteristic polynomial

\[ \mathrm{CharPoly}_{M_f}(u) = (u^2 - 18)(u^2 - 2), \]

where $u^2 - 18$ and $u^2 - 2$ are irreducible over $\mathbb{Q}$. By Proposition 2.4.13, we
get the primary decomposition $(x^2 - 2,\; y^2 - 2) = I_1 \cap I_2$, where

\[ I_1 = (x^2 - 2,\; y^2 - 2,\; (x + 2y)^2 - 18) = (x^2 - 2,\; x - y) \]
\[ I_2 = (x^2 - 2,\; y^2 - 2,\; (x + 2y)^2 - 2) = (x^2 - 2,\; x + y). \]

This is the primary decomposition of Example 2.4.12. (Exercise: Verify this.)

We could instead have used $f = x + y$, which has characteristic polynomial
$u^2(u^2 - 8)$. The function $f$ does not take distinct values on the roots but
does separate orbits of the Galois action. As noted above, the conclusion of
Proposition 2.4.13 still holds for such an $f$. (Exercise: Check that $u^2(u^2 - 8)$
leads to the above primary decomposition.)
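
The two characteristic polynomials quoted in Example 2.4.14 can be checked by hand or by machine. The following is a minimal sketch in Python, using only the standard library, under assumptions that are ours rather than the text's: we take the monomial basis 1, x, y, xy of Q[x, y]/(x^2 - 2, y^2 - 2), write down the multiplication matrices by hand, and compute characteristic polynomials with the Faddeev-LeVerrier recursion over exact fractions. The names M_X, M_Y, mat_mul, char_poly and mult_matrix are illustrative only.

```python
from fractions import Fraction

# Multiplication by x and by y on A = Q[x, y]/(x^2 - 2, y^2 - 2) in the
# monomial basis (1, x, y, xy); column j holds the image of the j-th basis vector.
M_X = [[0, 2, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 2],
       [0, 0, 1, 0]]
M_Y = [[0, 0, 2, 0],
       [0, 0, 0, 2],
       [1, 0, 0, 0],
       [0, 1, 0, 0]]


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Coefficients of det(u*I - a), leading coefficient first (Faddeev-LeVerrier)."""
    n = len(a)
    a = [[Fraction(entry) for entry in row] for row in a]
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        # M_k = a * M_{k-1} + c * I, where c is the coefficient found last
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(mat_mul(a, m)[i][i] for i in range(n))
        coeffs.append(-trace / k)
    return coeffs


def mult_matrix(s, t):
    """Matrix of multiplication by f = s*x + t*y on A."""
    return [[s * M_X[i][j] + t * M_Y[i][j] for j in range(4)] for i in range(4)]


char_x_plus_2y = char_poly(mult_matrix(1, 2))  # f = x + 2y
char_x_plus_y = char_poly(mult_matrix(1, 1))   # f = x + y
```

The first coefficient list corresponds to u^4 - 20u^2 + 36 = (u^2 - 18)(u^2 - 2) and the second to u^4 - 8u^2 = u^2(u^2 - 8). Factoring over Q is not attempted here; for larger examples a computer algebra system would be the natural tool.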


We next relate primary decomposition to the factorizations discussed in 
Section 2.4.1. 


Example 2.4.15. As in Section 2.4.1, suppose that $f(x), g(x) \in \mathbb{Q}[x]$ are
irreducible and $\alpha, \beta \in \mathbb{C}$ satisfy $f(\alpha) = g(\beta) = 0$. Also suppose that we have the
irreducible factorization

\[ f(x) = f_1(x) \cdots f_r(x) \quad\text{over } \mathbb{Q}(\beta). \]

We can relate this to primary decomposition as follows. Pick $t \in \mathbb{Q}$ such that
$f_0 = x + ty$ takes distinct values at the solutions of $f(x) = g(y) = 0$. In the
proof of Theorem 2.4.2, we showed that all solutions have multiplicity 1, so
that we have a factorization

\[ \mathrm{CharPoly}_{M_{f_0}}(u) = \prod_{i=1}^r G_i(u), \]

where the $G_i(u) \in \mathbb{Q}[u]$ are distinct irreducibles. Proposition 2.4.13 implies
that the primary decomposition of $(f(x), g(y)) \subset \mathbb{Q}[x,y]$ is

\[ (f(x), g(y)) = \bigcap_{i=1}^r \bigl(f(x),\; g(y),\; G_i(x + ty)\bigr), \]

and Theorem 2.4.2 asserts that the irreducible factors of $f(x)$ in $\mathbb{Q}(\beta)[x]$ are
$f_i(x) = \gcd(G_i(x + t\beta), f(x))$.

Since $\mathbb{Q}(\beta)[x] \simeq \mathbb{Q}[x,y]/(g(y))$, there is a polynomial $f_i(x,y) \in \mathbb{Q}[x,y]$ such
that $f_i(x,\beta) = f_i(x)$ in $\mathbb{Q}(\beta)[x]$. Then

\[ \bigl(f(x),\; g(y),\; G_i(x + ty)\bigr) = \bigl(g(y),\; f_i(x,y)\bigr). \]

(Exercise: Prove this.) Hence the above primary decomposition can be written

\[ (f(x), g(y)) = \bigcap_{i=1}^r \bigl(g(y),\; f_i(x,y)\bigr). \]

This shows that there is a close relation between primary decomposition and
factorization.


There is also a version of Proposition 2.4.13 that uses minimal polynomials 
instead of characteristic polynomials. 


Proposition 2.4.16. Suppose that $(f_1,\dots,f_s)$ is zero-dimensional and
$f \in K[x_1,\dots,x_n]$ takes distinct values at the solutions of (2.36). If the irreducible
factorization of $\mathrm{MinPoly}_{M_f}(u)$ is

\[ \mathrm{MinPoly}_{M_f}(u) = \prod_{i=1}^r p_i(u)^{n_i}, \]

where $p_1(u),\dots,p_r(u)$ are distinct monic irreducible polynomials, then the
primary decomposition of $(f_1,\dots,f_s)$ is given by

\[ (f_1,\dots,f_s) = I_1 \cap \cdots \cap I_r, \]

where

\[ I_i = \bigl(f_1,\dots,f_s,\; p_i(f)^{n_i}\bigr). \]

Proof. See [ABRW96] or [YNT92].


Algorithmic Aspects. From the point of view of doing primary decompo- 
sition algorithmically, one weakness of Proposition 2.4.13 is that f needs to 
take distinct values at the solutions. How do we do this without knowing the 
solutions? This problem was discussed at the end of Section 2.1.5. Another 
weakness of this method is that computing the characteristic polynomial of 
a large matrix can be time-consuming. The timings reported in [Mon02] in- 
dicate that as the number of solutions increases, methods based on [GTZ88] 
outperform the algorithm using Proposition 2.4.13. 

Other approaches to primary decomposition are given in [EHV92] and
[MMM96]. See also Chapter 5.


2.5 Galois theory


Solving equations has been our main topic of discussion. Since Galois theory 
is also concerned with the solutions of equations, it makes sense that there 
should be some link. As we will see, turning a polynomial equation f(x) = 0 
of degree n into n equations in n unknowns is a very useful thing to do. 

To illustrate our approach, consider the splitting field of $x^2 - x - 1 \in \mathbb{Q}[x]$.
Two simple descriptions of this field are

\[ \mathbb{Q}(\sqrt{5}) \quad\text{and}\quad \mathbb{Q}[y]/(y^2 - 5). \]

However, we will see that the splitting field can also be expressed as

\[ \mathbb{Q}[x_1,x_2]/(x_1 + x_2 - 1,\; x_1x_2 + 1). \tag{2.39} \]

Although this may seem more complicated, it has the advantage of giving
explicit descriptions of the roots (the cosets of $x_1$ and $x_2$) and the Galois action
(permute these two cosets). Note also that the generators of the ideal appearing
in (2.39) make perfect sense since they give the sum and product of the roots
of $x^2 - x - 1$.

The quotient (2.39) is an example of a splitting algebra. We will see that 
our methods, when applied to general splitting algebras, lead to some standard 
results in Galois theory. We will also show that primary decomposition gives 
an algorithm for computing Galois groups. 


2.5.1 Splitting algebras 


Let $K$ be an infinite field and $f(x) \in K[x]$ be a monic polynomial of degree $n$
with distinct roots. We will write $f(x)$ as

\[ f(x) = x^n - c_1x^{n-1} + \cdots + (-1)^nc_n, \quad c_i \in K. \]

The elementary symmetric polynomials $\sigma_1,\dots,\sigma_n \in K[x_1,\dots,x_n]$ are defined
by the identity

\[ (x - x_1) \cdots (x - x_n) = x^n - \sigma_1x^{n-1} + \cdots + (-1)^n\sigma_n. \tag{2.40} \]

Consider the system of $n$ equations in $x_1,\dots,x_n$ given by

\[ \begin{aligned} \sigma_1(x_1,\dots,x_n) - c_1 &= 0 \\ \sigma_2(x_1,\dots,x_n) - c_2 &= 0 \\ &\;\;\vdots \\ \sigma_n(x_1,\dots,x_n) - c_n &= 0. \end{aligned} \tag{2.41} \]

The associated algebra is

\[ A = K[x_1,\dots,x_n]/(\sigma_1 - c_1,\dots,\sigma_n - c_n). \]

This is the splitting algebra of $f$ over $K$. The system (2.41) and the algebra $A$
were first written down by Kronecker in 1882 and 1887 respectively (see page
282 of [Kro31, Vol. II] for the equations and page 213 of [Kro31, Vol. III] for
the algebra). A very nice modern treatment of the splitting algebra appears
in the recent preprint [EL02].


The Universal Property. We first explain why the splitting algebra deserves
its name. The natural map $K[x_1,\dots,x_n] \to A$ takes $\sigma_i$ to $c_i$, so that by (2.40),
the cosets $[x_i] \in A$ become roots of $f(x)$. It follows that

    $f(x)$ splits completely over $A$.

But more is true, for the factorization of $f(x)$ over $A$ controls all possible ways
in which $f(x)$ splits. Here is the precise statement.


Proposition 2.5.1. Suppose that $R$ is a $K$-algebra such that $f(x)$ splits
completely over $R$ via

\[ f(x) = (x - \alpha_1) \cdots (x - \alpha_n), \quad \alpha_1,\dots,\alpha_n \in R. \]

Then there is a $K$-algebra homomorphism $\varphi : A \to R$ such that this splitting
is the image under $\varphi$ of the splitting of $f(x)$ over $A$.


Proof. Consider the $K$-algebra homomorphism $\Phi : K[x_1,\dots,x_n] \to R$
determined by $x_i \mapsto \alpha_i$. This maps (2.40) to the splitting in the statement of the
proposition, so that $\Phi$ maps $\sigma_i$ to $c_i$. Hence $\Phi(\sigma_i - c_i) = 0$ for all $i$, which
implies that $\Phi$ induces a $K$-algebra homomorphism $\varphi : A \to R$. It follows
easily that $\varphi$ has the desired property.


The splitting of f(x) over A is thus “universal” in the sense that any other 
splitting is a homomorphic image of this one. 


The Dimension of A. Our next task is to compute $\dim_K A$. By Section 2.1.3,
the dimension is the number of solutions, counted with multiplicity. Let $\overline{K}$ be
the algebraic closure of $K$ and fix a splitting

\[ f(x) = (x - \alpha_1) \cdots (x - \alpha_n) \in \overline{K}[x]. \]

Using this, we can describe the solutions of (2.41) as follows. If
$(\beta_1,\dots,\beta_n) \in \overline{K}^n$ is a solution, then the substitutions $x_i \mapsto \beta_i$ take (2.40) to

\[ f(x) = (x - \beta_1) \cdots (x - \beta_n) \in \overline{K}[x]. \]

Thus the $\beta_i$'s are some permutation of the $\alpha_i$. Since $f(x)$ has distinct roots by
hypothesis, there is a unique $\sigma \in S_n$ such that $\beta_i = \alpha_{\sigma(i)}$ for all $i$. It follows
easily that (2.41) has precisely $n!$ solutions given by

\[ (\alpha_{\sigma(1)},\dots,\alpha_{\sigma(n)}), \quad \sigma \in S_n. \]

We can determine the multiplicities of these solutions as follows. Since $\sigma_i - c_i$
has degree $i$ as a polynomial in $x_1,\dots,x_n$, Bézout's theorem tells us that
(2.41) has at most $1 \cdot 2 \cdot 3 \cdots n = n!$ solutions, counting multiplicity. Since we
have $n!$ solutions, the multiplicities must all be 1. It follows that

\[ \dim_K A = n!. \]
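
As a small sanity check of the count of solutions of (2.41), here is a sketch in Python (standard library only). The polynomial f = (x - 1)(x - 2)(x - 3) and the search grid 0..4 are our own illustrative choices, and the function name elementary_symmetric is hypothetical. The search only covers integer points, so it illustrates the statement rather than proving it.

```python
from itertools import combinations, permutations, product
from math import prod


def elementary_symmetric(values):
    """Return [sigma_1, ..., sigma_n] evaluated at the given values."""
    n = len(values)
    return [sum(prod(c) for c in combinations(values, k))
            for k in range(1, n + 1)]


# f(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, so (c_1, c_2, c_3) = (6, 11, 6).
roots = (1, 2, 3)
coeffs = elementary_symmetric(roots)

# Search a small integer grid for all points satisfying sigma_i - c_i = 0.
grid_solutions = [point for point in product(range(5), repeat=3)
                  if elementary_symmetric(point) == coeffs]
```

The grid search finds exactly the 3! = 6 rearrangements of the roots, in line with the description of the solutions as the points obtained by permuting the roots.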


The Action of $S_n$. The symmetric group $S_n$ acts on $K[x_1,\dots,x_n]$ by
permuting the variables. Since $\sigma_i - c_i$ is invariant under this action, the action
descends to an action of $S_n$ on the splitting algebra $A$.


The Emergence of Splitting Fields. Although $f$ splits over $A$, this algebra
need not be a field. So how does $A$ relate to the splitting fields of $f$ over $K$?
We will analyze this following Kronecker's approach.

Since $K$ is infinite, $f_0 = t_1x_1 + \cdots + t_nx_n$ takes distinct values at the
solutions of (2.41) for most choices of $t_1,\dots,t_n \in K$. Thus, as $\sigma$ varies over
the elements of $S_n$,

\[ f_0(\alpha_{\sigma(1)},\dots,\alpha_{\sigma(n)}) = t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)} \tag{2.42} \]

gives $n!$ distinct elements of $\overline{K}$. Since all solutions of (2.41) have multiplicity 1,
the characteristic polynomial of $M_{f_0}$ on $A$ is

\[ \mathrm{CharPoly}_{M_{f_0}}(u) = \prod_{\sigma \in S_n} \bigl(u - (t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)})\bigr) \tag{2.43} \]

and the linear map $M_{f_0}$ is non-derogatory. By Proposition 2.1.12, it follows
that the map sending $u$ to $[t_1x_1 + \cdots + t_nx_n] \in A$ induces a $K$-algebra
isomorphism

\[ K[u]/(\mathrm{CharPoly}_{M_{f_0}}(u)) \simeq A. \]

Now factor $\mathrm{CharPoly}_{M_{f_0}}(u)$ into a product of monic irreducible polynomials
in $K[u]$, say

\[ \mathrm{CharPoly}_{M_{f_0}}(u) = \prod_{i=1}^r G_i(u). \]

Since $\mathrm{CharPoly}_{M_{f_0}}(u)$ has distinct roots, the $G_i(u)$ are distinct. Hence we get
$K$-algebra isomorphisms

\[ A \simeq K[u]/(\mathrm{CharPoly}_{M_{f_0}}(u)) \simeq \prod_{i=1}^r \underbrace{K[u]/(G_i(u))}_{K_i}. \tag{2.44} \]

Each $K_i$ is a field, and since the projection map $A \to K_i$ is surjective, each
$K_i$ is a splitting field of $f$ over $K$. Thus the factorization of the characteristic
polynomial of $M_{f_0}$ shows that the splitting algebra $A$ is isomorphic to a
product of fields, each of which is a splitting field of $f$ over $K$.

While the decomposition $A \simeq \prod_{i=1}^r K_i$ is nice, there are still some
unanswered questions:

• Are the fields $K_i$ isomorphic?
• When $r > 1$, $A$ involves several splitting fields. Why?

We will answer these questions in Sections 2.5.2 and 2.5.3.
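
To see the construction at work in the smallest case, consider again $f = x^2 - x - 1$ over $\mathbb{Q}$ from the beginning of Section 2.5. The choice $t_1 = 1$, $t_2 = 2$ below is ours and is only one of many that work. Write $\alpha_1, \alpha_2 = (1 \pm \sqrt{5})/2$, so that $\alpha_1 + \alpha_2 = 1$. Then $f_0 = x_1 + 2x_2$ takes the two values

\[ \alpha_1 + 2\alpha_2 = 1 + \alpha_2, \qquad \alpha_2 + 2\alpha_1 = 1 + \alpha_1, \]

which are distinct, and the product over the two permutations gives

\[ \mathrm{CharPoly}_{M_{f_0}}(u) = (u - 1 - \alpha_1)(u - 1 - \alpha_2) = f(u - 1) = (u - 1)^2 - (u - 1) - 1 = u^2 - 3u + 1. \]

This polynomial has discriminant $9 - 4 = 5$, so it is irreducible over $\mathbb{Q}$. Hence $r = 1$ in this instance, and the splitting algebra $\mathbb{Q}[x_1,x_2]/(x_1 + x_2 - 1,\; x_1x_2 + 1) \simeq \mathbb{Q}[u]/(u^2 - 3u + 1)$ is itself a field isomorphic to $\mathbb{Q}(\sqrt{5})$.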


History. The methods described here date back to Galois and Kronecker.
For example, in 1830 Galois chose $t_1,\dots,t_n$ such that the $n!$ values (2.42) are
distinct and showed that

\[ V = t_1\alpha_1 + \cdots + t_n\alpha_n \]

is a primitive element of the splitting field. He also used the polynomial on the
right-hand side of (2.43). In all of this, Galois simply assumed the existence
of the roots.

In 1887, Kronecker gave the first rigorous construction of splitting fields.
His method was to prove the existence of $t_1,\dots,t_n$ as above and then factor
$\mathrm{CharPoly}_{M_{f_0}}(u)$ into irreducibles. Letting $G_i(u)$ be one of the factors of
$\mathrm{CharPoly}_{M_{f_0}}(u)$, he showed that $K[u]/(G_i(u))$ is a splitting field of $f$ over $K$.


2.5.2 Some Galois theory 


We now use the above description of the splitting algebra $A$ to prove some
standard results in Galois theory. We begin by observing that $A$ has two
structures: an action of $S_n$ and a product decomposition

\[ A \simeq \prod_{i=1}^r K_i, \]

where $K_i$ is a splitting field of $f$ over $K$. As we will see, the Galois group arises
naturally from the interaction between these structures.

Since the decomposition $A \simeq \prod_{i=1}^r K_i$ is unique up to isomorphism, it
follows that for $1 \le i \le r$ and $\sigma \in S_n$ we have $\sigma(K_i) = K_j$ for some $j$. Then
we get the following result.


Proposition 2.5.2. $S_n$ acts transitively on the set of fields $\{K_1,\dots,K_r\}$ and
for each $i = 1,\dots,r$, there is a natural isomorphism

\[ \mathrm{Gal}(K_i/K) \simeq \{\sigma \in S_n \mid \sigma(K_i) = K_i\}. \]


Proof. Under the isomorphism

\[ K[u]/(\mathrm{CharPoly}_{M_{f_0}}(u)) \simeq A, \]

$S_n$ permutes the factors of $\mathrm{CharPoly}_{M_{f_0}}(u) = \prod_{i=1}^r G_i(u)$. Over $\overline{K}$, the
factorization becomes

\[ \mathrm{CharPoly}_{M_{f_0}}(u) = \prod_{\sigma \in S_n} \bigl(u - (t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)})\bigr). \]

This shows that $S_n$ must permute the $G_i(u)$ transitively. By (2.44), we
conclude that $S_n$ permutes the $K_i$ transitively.

For the second assertion, let $\mathrm{Gal}_i = \{\sigma \in S_n \mid \sigma(K_i) = K_i\}$. Since every
such $\sigma$ induces an automorphism of $K_i$, we get an injective group homomorphism
$\mathrm{Gal}_i \to \mathrm{Gal}(K_i/K)$. To show that this map is surjective, take $\gamma \in \mathrm{Gal}(K_i/K)$.
Under the projection $A \to K_i$, the cosets $[x_j]$ map to roots of $f(x)$ lying in
$K_i$. Then $\gamma$ must permute these according to some $\sigma \in S_n$. Since the roots
generate $K_i$ over $K$ and $\sigma$ permutes the roots, we have $\sigma(K_i) = K_i$. It follows
that $\sigma \in \mathrm{Gal}_i$ maps to $\gamma$. This gives the desired isomorphism.


We can use Proposition 2.5.2 to prove some classic results of Galois theory 
as follows. We begin with the uniqueness of splitting fields. 


Theorem 2.5.3. All splitting fields of f over K are isomorphic via an iso- 
morphism that is the identity on K. 


Proof. Let $L$ be an arbitrary splitting field of $f$ over $K$. Then the splitting of $f$
over $L$ must come from the universal splitting via a $K$-algebra homomorphism
$\varphi : A \to L$. Furthermore, $\varphi$ is onto since the roots of $f$ generate $L$ over $K$.
Using the decomposition $A \simeq \prod_{i=1}^r K_i$, we obtain a surjection

\[ \prod_{i=1}^r K_i \longrightarrow L. \]

It is now easy to see that $L \simeq K_i$ for some $i$. (Exercise: Prove this.) Then we
are done since this is an isomorphism of $K$-algebras and the $K_i$ are mutually
isomorphic $K$-algebras by the transitivity proved in Proposition 2.5.2.


Theorem 2.5.4. $\#\mathrm{Gal}(K_i/K) = [K_i : K]$.


Proof. As above, let $\mathrm{Gal}_i = \{\sigma \in S_n \mid \sigma(K_i) = K_i\}$. Thus $\mathrm{Gal}_i$ is the isotropy
subgroup of $K_i$ under the action of $S_n$ on $\{K_1,\dots,K_r\}$. Since this action is
transitive by Proposition 2.5.2, we see that

\[ \#\mathrm{Gal}_i = \frac{n!}{r}. \]

However, we know that the $K_1,\dots,K_r$ are mutually isomorphic. Thus

\[ n! = \dim_K A = [K_1 : K] + \cdots + [K_r : K] = r\,[K_i : K]. \]

Combining this with the previous equation gives $\#\mathrm{Gal}_i = [K_i : K]$. Then we
are done since $\mathrm{Gal}_i \simeq \mathrm{Gal}(K_i/K)$ by Proposition 2.5.2.
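
Two quadratic instances over $\mathbb{Q}$ may help fix the bookkeeping in this proof; both polynomials and the choice $f_0 = x_1 + 2x_2$ are our own illustrations. For $f = x^2 - x - 1$, the two values of $f_0$ at the solutions are conjugate irrationals, so the characteristic polynomial of $M_{f_0}$ is irreducible and

\[ r = 1, \qquad [K_1 : \mathbb{Q}] = 2, \qquad \#\mathrm{Gal}_1 = \frac{2!}{1} = 2. \]

For $f = x^2 - 3x + 2 = (x - 1)(x - 2)$, the values of $f_0$ at the solutions $(1,2)$ and $(2,1)$ are $5$ and $4$, so the characteristic polynomial is $(u - 4)(u - 5)$ and

\[ r = 2, \qquad K_1 = K_2 = \mathbb{Q}, \qquad \#\mathrm{Gal}_i = \frac{2!}{2} = 1 = [K_i : \mathbb{Q}]. \]

In both cases $n! = r\,[K_i : \mathbb{Q}]$, as the proof requires.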


Theorem 2.5.5. If the characteristic of $K$ doesn't divide $\#\mathrm{Gal}(K_i/K)$, then
$K$ is the fixed field of the action of $\mathrm{Gal}(K_i/K)$ on $K_i$.


Proof. Let $\alpha \in K_i$ be in the fixed field and set $N = \#\mathrm{Gal}(K_i/K)$. We may
assume that $\alpha \ne 0$. Let $p \in K[x_1,\dots,x_n]$ map to $\alpha \in K_i$ and to $0 \in K_j$ for
$j \ne i$. Then $P = \sum_{\sigma \in S_n} \sigma \cdot p$ is symmetric and hence is a polynomial in the
$\sigma_i$ by the Fundamental Theorem of Symmetric Polynomials. In $A$, this means
that $[P] \in K$, so that $P$ projects to an element of $K$ in each of $K_1,\dots,K_r$.

Now consider the projection of $P$ onto $K_i$. If

\[ \sigma \in \mathrm{Gal}_i = \{\sigma \in S_n \mid \sigma(K_i) = K_i\} \simeq \mathrm{Gal}(K_i/K), \]

then $\sigma \cdot p$ projects to $\sigma(\alpha) = \alpha$. On the other hand, if $\sigma \notin \mathrm{Gal}_i$, then $\sigma \cdot p$
projects to 0 since $K_i \cap \sigma(K_i) = \{0\}$ for such $\sigma$. (Exercise: Prove this.) It
follows that the projection of $P$ onto $K_i$ is $N\alpha$. Thus $N\alpha \in K$, and then
$\alpha \in K$ follows by hypothesis.

A more general version of Theorem 2.5.5 is proved in [EL02].


Resultants. The characteristic polynomial $\mathrm{CharPoly}_{M_{f_0}}(u)$ is a resultant in
disguise. More precisely, we claim that

\[ \mathrm{Res}_{1,1,2,\dots,n}(u - f_0,\; \sigma_1 - c_1,\dots,\sigma_n - c_n) = -\mathrm{CharPoly}_{M_{f_0}}(u). \tag{2.45} \]

To prove this, recall from Theorem 2.3.5 of Section 2.3 that this resultant
equals the characteristic polynomial multiplied by

\[ \mathrm{Res}_{1,2,\dots,n}\bigl((\sigma_1 - c_1)_1,\dots,(\sigma_n - c_n)_n\bigr), \]

where $(\sigma_i - c_i)_i$ consists of the terms of $\sigma_i - c_i$ of degree $i$. This is obviously
just $\sigma_i$, so that this multiplier reduces to

\[ \mathrm{Res}_{1,2,\dots,n}(\sigma_1,\dots,\sigma_n). \]

This resultant equals $-1$ by Exercise 11 of Section 3 of Chapter 3 of [CLO98].
Hence we obtain (2.45) as claimed.


Action of the Symmetric Group. Finally, we will describe the action of
$S_n$ on the product decomposition $A = K_1 \times \cdots \times K_r$. If we let $\alpha_{ij} \in K_j$ be
the projection of $[x_i] \in A$ onto the $j$th factor, then $\alpha_{1j},\dots,\alpha_{nj}$ are the roots
of $f(x)$ in $K_j$. So we have $r$ isomorphic copies of the splitting field together
with an ordered list of the roots in each field. Let

\[ e_j = (0,\dots,0,1,0,\dots,0) \in K_1 \times \cdots \times K_r, \]

where the 1 is in the $j$th position. Then by abuse of notation we can write
$\alpha_{ij}e_j \in A$. Now take $\sigma \in S_n$ and suppose that $\sigma(e_j) = e_\ell$ (this is a precise
way of saying that $\sigma(K_j) = K_\ell$). Then one can show without difficulty that
$\sigma([x_i]) = [x_{\sigma(i)}]$ implies that

\[ \sigma(\alpha_{ij}e_j) = \alpha_{\sigma(i)\ell}\,e_\ell. \tag{2.46} \]

(Exercise: Prove this.) In the special case when $\sigma$ comes from an element of
$\mathrm{Gal}(K_j/K)$, (2.46) gives the action of the Galois group on the roots. The nice
thing about (2.46) is that it tells us what happens when we apply an arbitrary
permutation, not just those coming from $\mathrm{Gal}(K_j/K)$.


2.5.3 Primary decomposition


The Galois group of f consists of all permutations in S,, that preserve the 
algebraic structure of the roots. In this section, we will use primary decom- 
position to describe “the algebraic structure of the roots” and see how the 
Galois group “preserves” this structure. We will work over Q for simplicity. 

Given $f = x^n - c_1x^{n-1} + \cdots + (-1)^nc_n \in \mathbb{Q}[x]$ as in Section 2.5.1, the
splitting algebra is

\[ A = \mathbb{Q}[x_1,\dots,x_n]/(\sigma_1 - c_1,\dots,\sigma_n - c_n), \]

where as usual $\sigma_i$ is the $i$th elementary symmetric polynomial. We've seen
that $A$ is a product

\[ A \simeq \prod_{i=1}^r K_i, \]

where each $K_i$ is a splitting field of $f$ over $\mathbb{Q}$. But we also have the primary
decomposition

\[ (\sigma_1 - c_1,\dots,\sigma_n - c_n) = \bigcap_{i=1}^r I_i, \]

where $I_i$ is maximal in $\mathbb{Q}[x_1,\dots,x_n]$. These decompositions are related by

\[ K_i \simeq \mathbb{Q}[x_1,\dots,x_n]/I_i, \quad i = 1,\dots,r. \]

Each $I_i$ is larger than $(\sigma_1 - c_1,\dots,\sigma_n - c_n)$. The ideal $(\sigma_1 - c_1,\dots,\sigma_n - c_n)$
encodes the obvious relations among the roots, and the polynomials we add
to get from $(\sigma_1 - c_1,\dots,\sigma_n - c_n)$ to $I_i$ reflect the extra algebraic relations
between the roots that hold in the splitting field $K_i$. Having more relations
among the roots means that $I_i$ is larger and hence $K_i$ and the Galois group
are smaller.

For instance, if the Galois group of $f$ is $S_n$, then $(\sigma_1 - c_1,\dots,\sigma_n - c_n)$
is a maximal ideal and the splitting algebra is the splitting field. This means
that the only relations among the roots are the obvious ones relating the
coefficients to the roots via the elementary symmetric polynomials.

Let's see what happens when the Galois group is smaller than $S_n$.


Example 2.5.6. Let $f = x^3 - c_1x^2 + c_2x - c_3 \in \mathbb{Q}[x]$ be an irreducible cubic.
The splitting algebra of $f$ is $A = \mathbb{Q}[x_1,x_2,x_3]/(\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3)$. It is
well-known that

\[ \text{the Galois group of } f \text{ is isomorphic to } \begin{cases} S_3 & \text{if } \Delta(f) \notin \mathbb{Q}^2 \\ \mathbb{Z}/3\mathbb{Z} & \text{if } \Delta(f) \in \mathbb{Q}^2, \end{cases} \]

where $\Delta(f) \in \mathbb{Q}$ is the discriminant of $f$. By the above analysis, it follows
that $A$ is the splitting field of $f$ when $\Delta(f) \notin \mathbb{Q}^2$.

Now suppose that $\Delta(f) = a^2$ for some $a \in \mathbb{Q}$. In this case, the splitting
algebra is a product of two copies of the splitting field, i.e., $A = K_1 \times K_2$. Let

\[ \sqrt{\Delta} = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3) \in \mathbb{Q}[x_1,x_2,x_3]. \]

In the splitting algebra $A$, we have $[\sqrt{\Delta}]^2 = [\Delta(f)]$, so that

\[ [\sqrt{\Delta}]^2 = [a]^2. \]

Since $A$ is not an integral domain, this does not imply $[\sqrt{\Delta}] = \pm[a]$. In fact,
$[\sqrt{\Delta}] \in A$ cannot have a numerical value since $[\sqrt{\Delta}]$ is not invariant under
$S_3$. Yet once we map to a field, the value must be $\pm a$. But which sign do
we choose? The answer is both, which explains why we need two fields in the
splitting algebra.

In this case, we have the primary decomposition

\[ (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3) = I_1 \cap I_2, \]

where

\[ I_1 = (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sqrt{\Delta} - a) \]
\[ I_2 = (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sqrt{\Delta} + a). \]

(Exercise: Prove this.) Note also that this is compatible with the action of $S_3$.
For example, $(12) \in S_3$ maps $I_1$ to $I_2$ since $(12) \cdot \sqrt{\Delta} = -\sqrt{\Delta}$. It follows that
$(12)$ maps $K_1$ to $K_2$ in the decomposition $A = K_1 \times K_2$. This is consistent
with the description of the $S_n$ action given at the end of Section 2.5.2.
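
The two-component picture of Example 2.5.6 can be explored numerically. The sketch below (Python, standard library only) uses the cubic x^3 - 3x + 1, which is our own choice: it is irreducible over Q with discriminant 81 = 9^2, and its roots are 2cos(2πk/9) for k = 1, 2, 4. Roots are floating-point approximations and the names sqrt_disc and component are illustrative, so this is a numerical illustration and not a Gröbner basis computation.

```python
import itertools
import math

# Approximate roots of f = x^3 - 3x + 1; the discriminant is 81, so a = 9.
roots = [2 * math.cos(2 * math.pi * k / 9) for k in (1, 2, 4)]
a = 9


def sqrt_disc(x1, x2, x3):
    """The polynomial (x1 - x2)(x1 - x3)(x2 - x3) evaluated at a point."""
    return (x1 - x2) * (x1 - x3) * (x2 - x3)


# Each permutation gives a solution of the system sigma_i - c_i = 0.
# Record whether that solution lies on the component where sqrt_disc = a
# (labelled 1) or on the component where sqrt_disc = -a (labelled 2).
component = {}
for perm in itertools.permutations(range(3)):
    value = sqrt_disc(*(roots[i] for i in perm))
    if math.isclose(value, a, abs_tol=1e-9):
        component[perm] = 1
    elif math.isclose(value, -a, abs_tol=1e-9):
        component[perm] = 2

on_first = sorted(p for p, label in component.items() if label == 1)
```

With this ordering of the roots, the three solutions on the first component are the cyclic rearrangements of the roots and the other three lie on the second, which matches a Galois group of order 3 and a splitting algebra with two factors.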


Example 2.5.6 is analogous to what happens in quantum mechanics when
an observation forces a mixed state (e.g. a superposition of pure states with
different energy levels) to become a pure state (with a fixed energy level). In
Example 2.5.6, the idea is that $[\sqrt{\Delta}]^2 = [\Delta(f)] = [a]^2$ means that $[\sqrt{\Delta}]$ is
somehow a "mixed state" which becomes a "pure state" (i.e., $\pm a \in \mathbb{Q}$) when
"observed" (i.e., when mapped to a field).

The quartic is more complicated since there are five possibilities for the
Galois group of an irreducible quartic. We will discuss the following case.


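Example 2.5.7. Let $f = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4 \in \mathbb{Q}[x]$ be an irreducible
quartic with splitting algebra

\[ A = \mathbb{Q}[x_1,x_2,x_3,x_4]/(\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sigma_4 - c_4). \]

One of the tools used in solving the quartic is the Ferrari resolvent

\[ x^3 - c_2x^2 + (c_1c_3 - 4c_4)x - c_3^2 - c_1^2c_4 + 4c_2c_4. \tag{2.47} \]

Euler showed that if $\beta_1, \beta_2, \beta_3$ are the roots of (2.47), then the roots of $f$ are

\[ \tfrac{1}{4}\Bigl(c_1 \pm \sqrt{4\beta_1 + c_1^2 - 4c_2} \pm \sqrt{4\beta_2 + c_1^2 - 4c_2} \pm \sqrt{4\beta_3 + c_1^2 - 4c_2}\Bigr), \]

provided the signs are chosen so that the product of the square roots is
$c_1^3 - 4c_1c_2 + 8c_3$. Also, as shown by Lagrange, the roots of the resolvent (2.47) are

\[ \alpha_1\alpha_2 + \alpha_3\alpha_4, \quad \alpha_1\alpha_3 + \alpha_2\alpha_4, \quad \alpha_1\alpha_4 + \alpha_2\alpha_3. \tag{2.48} \]

The Galois group $G$ of $f$ over $\mathbb{Q}$ is isomorphic to one of the groups

\[ S_4, \quad A_4, \quad D_8, \quad \mathbb{Z}/4\mathbb{Z}, \quad \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}, \]

where $D_8$ is the dihedral group of order 8. Three cases are easy to distinguish:

\[ G \simeq \begin{cases} S_4 & \text{if } \Delta(f) \notin \mathbb{Q}^2 \text{ and (2.47) is irreducible over } \mathbb{Q} \\ A_4 & \text{if } \Delta(f) \in \mathbb{Q}^2 \text{ and (2.47) is irreducible over } \mathbb{Q} \\ \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} & \text{if } \Delta(f) \in \mathbb{Q}^2 \text{ and (2.47) is reducible over } \mathbb{Q}. \end{cases} \]

The remaining case is when $\Delta(f) \notin \mathbb{Q}^2$ and (2.47) has a root in $\mathbb{Q}$. Here, the
Galois group is $D_8$ or $\mathbb{Z}/4\mathbb{Z}$. We state without proof the following nice fact:
$G \simeq D_8$ if and only if $\Delta(f) \notin \mathbb{Q}^2$, (2.47) has a root $b \in \mathbb{Q}$ and

\[ (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sigma_4 - c_4) = I_1 \cap I_2 \cap I_3 \]

is the primary decomposition, where

\[ I_1 = (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sigma_4 - c_4,\; x_1x_2 + x_3x_4 - b) \]
\[ I_2 = (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sigma_4 - c_4,\; x_1x_3 + x_2x_4 - b) \]
\[ I_3 = (\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sigma_4 - c_4,\; x_1x_4 + x_2x_3 - b). \]

The reason for three ideals is that $b$ is one of the three combinations of roots
given in (2.48). To get a field out of the ideal $(\sigma_1 - c_1,\; \sigma_2 - c_2,\; \sigma_3 - c_3,\; \sigma_4 - c_4)$,
we must commit to which combination gives $b$. This gives the ideals $I_1, I_2, I_3$
as above.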


The Galois group. We also observe that primary decomposition gives an
algorithm for computing the Galois group of $f = x^n - c_1x^{n-1} + \cdots + (-1)^n c_n$.
To do this, pick $f_0 = t_1x_1 + \cdots + t_nx_n$ such that $M_{f_0} : A \to A$ is
non-derogatory and let

\[ \mathrm{CharPoly}_{M_{f_0}}(u) = \prod_{i=1}^{r} G_i(u) \]

be the irreducible factorization of the characteristic polynomial in $\mathbb{Q}[u]$. Then
Proposition 2.4.13 implies that we have the primary decomposition

\[ (\sigma_1 - c_1, \ldots, \sigma_n - c_n) = \bigcap_{i=1}^{r} I_i, \]

where

\[ I_i = (\sigma_1 - c_1, \ldots, \sigma_n - c_n, G_i(f_0)). \]

Furthermore, $S_n$ permutes the $I_i$ since $(\sigma_1 - c_1, \ldots, \sigma_n - c_n)$ is invariant, and

\[ \mathrm{Gal}(K_i/\mathbb{Q}) \simeq \{\sigma \in S_n \mid \sigma(I_i) = I_i\}. \]

Using a Gröbner basis of $I_i$, we can determine whether $\sigma(I_i)$ equals $I_i$ for any
given $\sigma \in S_n$. Hence, by going through the elements of $S_n$ one-by-one, we get
a (horribly inefficient) algorithm for computing the Galois group. However,
in simple examples like Examples 2.5.6 or 2.5.7, the Galois group is easy to
determine from the primary decomposition. (Exercise: Do this computation.)

Finally, we note that many of the ideas in Section 2.5 are well-known 
to researchers in computational Galois theory. See, for example, [AV00] and 
[PZ89]. 
Summary. This tutorial gives an introductory presentation of algebraic and
geometric methods to solve a polynomial system $f_1 = \cdots = f_m = 0$. The algebraic
methods are based on the study of the quotient algebra $A$ of the polynomial ring
modulo the ideal $I = (f_1, \ldots, f_m)$. We show how to deduce the geometry of solutions
from the structure of A and in particular, how solving polynomial equations reduces 
to eigenvalue and eigenvector computations of multiplication operators in A. We 
give two approaches for computing the normal form of elements in A, used to ob- 
tain a representation of multiplication operators. We also present the duality theory 
and its application to solving systems of algebraic equations. The geometric meth- 
ods are based on projection operations which are closely related to resultant theory. 
We present different constructions of resultants and different methods for solving 
systems of polynomial equations based on these formulations. Finally, we illustrate 
these tools on problems coming from applications in computer-aided geometric de- 
sign, computer vision, robotics, computational biology and signal processing. 


3.0 Introduction 


Polynomial system solving is ubiquitous in applications such as computer
geometric design, geometric modelling, robotics, computer vision,
computational biology, and signal processing. Specific methods such as minimization
or Newton iterations are often used, but they do not always offer guarantees on
the result. In this paper, we give an introductory presentation of algebraic
methods for solving a polynomial system $f_1 = \cdots = f_m = 0$. By reformulating
the problem in terms of matrix manipulations, we obtain better
control of the structure and the accuracy of computations. The tools that
we introduce are illustrated by explicit computations. A MAPLE package
implements the algorithms described hereafter and is publicly available on the
Internet (footnote 3). We encourage the reader to use it for their own experimentation on
the examples illustrating the presentation. For more advanced computations
described in the last section, we use the C++ library SYNAPS, available on the
Internet (footnote 4). Our approach is based on the study of the quotient algebra $A$ of the
polynomial ring by the ideal $(f_1, \ldots, f_m)$. We describe, in the first part, the
well known method of Gröbner bases to compute the normal form of elements
in $A$, which yields the algebraic structure of this quotient. We also mention a
recent generalization of this approach which makes it possible to combine, more
safely, symbolic and numeric computations.


3 http://www.inria.fr/galaad/logiciels/multires/

In the second part, we show how to deduce the geometry of solutions from 
the structure of A. In particular, we show how solving polynomial systems 
reduces to the computation of eigenvalues or eigenvectors of operators of mul- 
tiplication in A. In the real case, we also show how to recover information on 
the real roots from this algebra. 

We also study duality theory and show how to use it for solving polynomial 
systems. 

Another major operation in effective algebraic geometry is projection. It is 
related to resultant theory. We present different notions and constructions of 
resultants and we derive methods to solve systems of polynomial equations. In 
practice, according to the class of systems that we want to solve, we will have 
to choose the resultant construction adapted to the geometry of the problem. 
Finally, we illustrate these tools on problems coming from several areas of 
applications. 

For more details on the material presented here, see [EM]. 


3.1 Solving polynomial systems 


The problem of solving polynomial equations goes back to the ancient Greeks 
and Chinese. It is not surprising that a large number of methods exists to 
handle this problem. We divide them into the following families and we will 
focus essentially on the last two classes. 


3.1.1 Classes of solvers 
Analytic solvers 


The analytic solvers exploit the value of the functional $f = (f_1, \ldots, f_m)$ and
its derivatives in order to converge to a solution or all the solutions of $f = 0$.
Typical examples are Newton-like methods, minimization methods, and
Weierstrass' method [Dem87, SS93, Bin96, MR02].


4 http://www.inria.fr/galaad/logiciels/synaps/


Homotopic solvers


The idea behind the homotopic approaches is to deform a system with known 
roots into the system f = 0 that we want to solve. Examples of such con- 
tinuation methods are based on projective [MS87b], toric [Li97, VVC94] or 
generally flat deformations of f = 0. See Chapter 8 and [AG90b] for more 
details. 


Subdivision solvers 


The subdivision methods use an exclusion criterion to remove a domain if 
it does not contain a root of f = 0. These solvers are often used to isolate 
the real roots, if possible. Exclusion criteria are based on Taylor’s exclusion 
function [DY93], interval arithmetic [Kea90], the Turan test [Pan96], Sturm’s 
method [BR90, Roy96], or Descartes’ rule [Usp48, RZ03, MVY02]. 


Algebraic solvers 


This class of methods exploits the known relations between the unknowns. 
They are based on normal form computations in the quotient algebra [CLO97,
MT00, MT02] and reduce to a univariate or eigenvalue problem [Mou98].


Geometric solvers 


These solvers project the problem onto a smaller subspace and exploit geo- 
metric properties of the set of solutions. Tools such as resultant constructions 
[GKZ94, EM99b, BEM00, BEM01, Bus01a] are used to reduce the solutions
of the polynomial system to a univariate or eigenvalue problem. This reduc- 
tion to univariate polynomials is also an important ingredient of triangular 
set methods [Tsii94, Wan95, ALMM99]. 


3.1.2 Notation 


We fix the notation that will be used hereafter. Let $K$ be a field, $\overline{K}$ be its
algebraic closure, and $R = K[x_1, \ldots, x_n] = K[\mathbf{x}]$ be the algebra of polynomials in
the variables $\mathbf{x} = (x_1, \ldots, x_n)$ with coefficients in $K$. For the sake of simplicity,
we will assume that $K$ is of characteristic 0.

Let $f_1, \ldots, f_m \in R$ be $m$ polynomials. Our objective is to solve the system
$f_1 = 0, \ldots, f_m = 0$, also denoted by $f = 0$. If $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$, we write
$|\alpha| = \alpha_1 + \cdots + \alpha_n$ and $\mathbf{x}^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$.

Let $I$ be the ideal generated by $f_1, \ldots, f_m$ in $R$ and $Z(I)$ be the affine
variety $\{\zeta \in \overline{K}^n : f_1(\zeta) = \cdots = f_m(\zeta) = 0\}$. We will assume that
$Z(I) = \{\zeta_1, \ldots, \zeta_d\}$ is a non-empty and finite set. The algebraic approach to solve the
system $f = 0$ is based on the study of the $K$-algebra $A = R/I$. The hypothesis
that $Z(I)$ is finite implies that the $K$-vector space $A$ is of finite dimension over
$K$, see Theorem 2.1.2 in Chapter 2. We denote by $\widehat{R}$ (resp. $\widehat{A}$) the dual of the
vector space $R$ (resp. $A$).

Algebraic solvers exploit the properties of A, which means that they must 
be able to compute effectively in this algebra. This can be performed by a so- 
called normal form algorithm. We are going to describe now two approaches 
to compute normal forms. 


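3.1.3 Gröbner bases


Gröbner bases are a major tool in effective algebraic geometry, yielding
algorithmic answers to many questions in this domain [CLO97, BW93, AL94,
Eis95]. They rely on the choice of a monomial ordering.


Definition 3.1.1. A monomial ordering is a total order $>$ on the set of
monomials of $K[\mathbf{x}]$ such that


i) $\forall \alpha \neq 0$, $1 < \mathbf{x}^\alpha$,
ii) $\forall (\alpha, \beta, \gamma) \in (\mathbb{N}^n)^3$, if $\mathbf{x}^\alpha < \mathbf{x}^\beta$ then $\mathbf{x}^{\alpha+\gamma} < \mathbf{x}^{\beta+\gamma}$.


Some well known monomial orderings are defined as follows:

Let $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$ and $\beta = (\beta_1, \ldots, \beta_n) \in \mathbb{N}^n$.

- The lexicographic ordering with $x_1 > \cdots > x_n$: $\mathbf{x}^\alpha <_l \mathbf{x}^\beta$ iff there exists
$i$ with $0 \le i < n$ such that $\alpha_1 = \beta_1, \ldots, \alpha_i = \beta_i$, $\alpha_{i+1} < \beta_{i+1}$.

- The graded lexicographic ordering with $x_1 > \cdots > x_n$: $\mathbf{x}^\alpha <_{gl} \mathbf{x}^\beta$ iff
$|\alpha| < |\beta|$ or ($|\alpha| = |\beta|$ and $\mathbf{x}^\alpha <_l \mathbf{x}^\beta$).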
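
The two orderings can be compared on exponent vectors directly. The following is a minimal sketch, assuming monomials are represented by their exponent tuples $(\alpha_1, \ldots, \alpha_n)$; the function names `lex_less` and `grlex_less` are chosen here for illustration and do not come from any package mentioned in the text.

```python
def lex_less(a, b):
    """x^a <_l x^b for the lexicographic ordering with x_1 > ... > x_n."""
    for ai, bi in zip(a, b):
        if ai != bi:
            # the first differing exponent decides
            return ai < bi
    return False  # equal exponent vectors


def grlex_less(a, b):
    """x^a <_gl x^b: compare total degrees first, break ties with lex."""
    if sum(a) != sum(b):
        return sum(a) < sum(b)
    return lex_less(a, b)


# Exponent tuples for n = 2: x2^3 is (0, 3) and x1 is (1, 0).
print(lex_less((0, 3), (1, 0)))    # True:  x2^3 <_l x1
print(grlex_less((0, 3), (1, 0)))  # False: x1 <_gl x2^3, since deg 1 < deg 3
```

The same pair of monomials is ordered differently by the two orderings, which is why the leading terms, and hence the Gröbner basis, depend on the ordering chosen.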

Given a monomial ordering $>$, we define, as in the univariate case, the
leading term of $p \in R$ as the term (the coefficient times its monomial) of $p$
whose monomial is maximal for $>$. We denote it by $L_>(p)$ (or simply $L(p)$).
We write every $p \in R$ as $p = a_0\mathbf{x}^{\alpha_0} + \cdots + a_l\mathbf{x}^{\alpha_l}$, with $a_i \neq 0$ and $\alpha_0 > \cdots > \alpha_l$.

Let $f, f_1, \ldots, f_m \in R$. As in the Euclidean division there are
polynomials $q_1, \ldots, q_m, r$ such that $f = q_1f_1 + \cdots + q_mf_m + r$, where no term of $r$
is divisible by any of $L(f_1), \ldots, L(f_m)$ (in this case we say that $r$ is reduced with
respect to $f_1, \ldots, f_m$). This is the multivariate division of $f$ by $f_1, \ldots, f_m$. The
polynomials $q_1, \ldots, q_m$ are the quotients and $r$ the remainder of this division.
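
The division just described can be sketched in a few lines. This is an illustrative implementation, not the one used by the packages cited in the text: polynomials are assumed to be dictionaries mapping exponent tuples to rational coefficients, the ordering is fixed to graded lexicographic, and the helper names (`grlex_key`, `leading`, `add_term`, `divide`) are hypothetical. The divisors in the usage example are the two polynomials $2x_2^2 - x_1 - x_2 + 1$ and $2x_1^2 - x_1 + 3x_2 - 5$.

```python
from fractions import Fraction


def grlex_key(a):
    # graded lexicographic ordering with x_1 > ... > x_n
    return (sum(a), a)


def leading(p):
    """Return (monomial, coefficient) of the leading term of a nonzero p."""
    m = max(p, key=grlex_key)
    return m, p[m]


def add_term(p, m, c):
    """In place: p += c * x^m, dropping the term if its coefficient vanishes."""
    c = p.get(m, 0) + c
    if c == 0:
        p.pop(m, None)
    else:
        p[m] = c


def divide(f, divisors):
    """Return (quotients, remainder) with f = sum(q_i * f_i) + remainder."""
    f = dict(f)
    quotients = [dict() for _ in divisors]
    remainder = {}
    leads = [leading(g) for g in divisors]
    while f:
        m, c = leading(f)
        for q, g, (lm, lc) in zip(quotients, divisors, leads):
            if all(mi >= li for mi, li in zip(m, lm)):  # L(g) divides c * x^m
                shift = tuple(mi - li for mi, li in zip(m, lm))
                factor = Fraction(c) / lc
                add_term(q, shift, factor)
                for gm, gc in g.items():
                    mono = tuple(s + e for s, e in zip(shift, gm))
                    add_term(f, mono, -factor * gc)
                break
        else:
            # no leading term divides x^m: the term belongs to the remainder
            add_term(remainder, m, c)
            del f[m]
    return quotients, remainder


g1 = {(0, 2): 2, (1, 0): -1, (0, 1): -1, (0, 0): 1}   # 2*x2^2 - x1 - x2 + 1
g2 = {(2, 0): 2, (1, 0): -1, (0, 1): 3, (0, 0): -5}   # 2*x1^2 - x1 + 3*x2 - 5
q, r = divide({(2, 1): 1}, [g1, g2])                  # divide x1^2 * x2
print(r)
```

Here the remainder is $\tfrac12 x_1x_2 - \tfrac34 x_1 + \tfrac74 x_2 + \tfrac34$, and none of its terms is divisible by $x_2^2$ or $x_1^2$.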

If $I$ is an ideal of $R = K[\mathbf{x}]$, we define $L_>(I)$ (or simply $L(I)$) to be the
ideal generated by the set of leading terms of elements of $I$.

By Dickson's lemma [CLO97] or by Noetherianity of $K[\mathbf{x}]$, this ideal $L_>(I)$
is generated by a finite set of monomials. This leads to the definition of
Gröbner bases:


Definition 3.1.2. A finite subset $G = \{g_1, \ldots, g_t\}$ of the ideal $I$ is a Gröbner
basis of $I$ for a given monomial order $>$ iff $L_>(I) = (L_>(g_1), \ldots, L_>(g_t))$.


Some interesting properties of a Gröbner basis $G$ are:

- For any $p \in R$, the remainder of the multivariate division of $p$ by $G$ is
unique. It is called the normal form of $p$ modulo the ideal $I$ and is denoted
by $N(p)$ (see [CLO97]).

- The polynomial $p \in I$ iff its normal form $N(p) = 0$.

- A basis $B$ of the $K$-vector space $A = R/I$ is given by the set of monomials
which are not in $L_>(I)$. This allows us to define the multiplication table by an
element $a \in A$: we first multiply the elements of $B$ by $a$ as ordinary polynomials
and then normalize the products by reduction by $G$.

The ideal $I$ can have several Gröbner bases but only one which is reduced
(i.e. the leading coefficients of elements of $G$ are equal to 1, and every $g \in G$
is reduced with respect to $G \setminus \{g\}$). Efficient algorithms and software have
been developed over the past decades to compute reduced Gröbner bases. We
mention in particular [Fau99], [GS], [GPS01], [Roba].


Example 3.1.3. Let $I$ be the ideal of $R = \mathbb{Q}[x_1, x_2]$ generated by

\[ f_1 = 13x_1^2 + 8x_1x_2 + 4x_2^2 - 8x_1 - 8x_2 + 2 \quad\text{and}\quad f_2 = x_1^2 + x_1x_2 - x_1 - \tfrac16. \]

The reduced Gröbner basis $G$ of $I$ for the graded lexicographic ordering with
$x_1 > x_2$ is (on Maple):


> with(Groebner); G:= gbasis([f1,f2],tdeg(x[1],x[2]));


\[ (30x_1x_2 - 30x_1 - 25 - 24x_2^2 + 48x_2,\ \ 15x_1^2 + 12x_2^2 - 24x_2 + 10,\ \ 216x_2^3 - 648x_2^2 + 5x_1 + 632x_2 - 200). \]

The leading monomials of elements of $G$ are $x_1x_2$, $x_1^2$, $x_2^3$. Then a basis of $A$
is $\{1, x_1, x_2, x_2^2\}$. Using the reduction by $G$, the matrix of multiplication by $x_1$
in this basis is:


> L:= map(u->normalf(u,G,tdeg(x[1],x[2])),
> [x[1],x[1]^2,x[1]*x[2],x[1]*x[2]^2]);


\[ \Bigl(x_1,\ \ -\tfrac45 x_2^2 + \tfrac85 x_2 - \tfrac23,\ \ x_1 + \tfrac56 + \tfrac45 x_2^2 - \tfrac85 x_2,\ \ -\tfrac{839}{270}x_2 + \tfrac85 x_2^2 + \tfrac{85}{54} + \tfrac{53}{54}x_1\Bigr) \]

> matrixof(L, [[1,x[1],x[2],x[2]^2]]);


\[ \begin{pmatrix}
0 & -\tfrac23 & \tfrac56 & \tfrac{85}{54} \\
1 & 0 & 1 & \tfrac{53}{54} \\
0 & \tfrac85 & -\tfrac85 & -\tfrac{839}{270} \\
0 & -\tfrac45 & \tfrac45 & \tfrac85
\end{pmatrix} \]

This is the matrix of coefficients of elements of the monomial basis multiplied
by $x_1$, expressed in this basis.
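
The last column of this matrix can be checked by hand. The following worked reduction is a sketch that uses only the three elements of $G$ listed above, solved for their leading monomials; congruences are modulo $I$.

```latex
\begin{align*}
x_1x_2 &\equiv x_1 + \tfrac56 + \tfrac45 x_2^2 - \tfrac85 x_2, \\
x_2^3 &\equiv 3x_2^2 - \tfrac{5}{216}x_1 - \tfrac{79}{27}x_2 + \tfrac{25}{27}, \\
x_1x_2^2 = x_2\,(x_1x_2) &\equiv x_1x_2 + \tfrac56 x_2 + \tfrac45 x_2^3 - \tfrac85 x_2^2 \\
&\equiv \tfrac85 x_2^2 + \tfrac{53}{54}x_1 - \tfrac{839}{270}x_2 + \tfrac{85}{54}.
\end{align*}
```

Reading off the coefficients on $(1, x_1, x_2, x_2^2)$ gives the column $(\tfrac{85}{54}, \tfrac{53}{54}, -\tfrac{839}{270}, \tfrac85)$.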
Since the variety $Z(I)$ is finite, a lexicographic Gröbner basis with
$x_n > \cdots > x_1$ contains elements $g_1, \ldots, g_n$ such that $g_i \in K[x_1, \ldots, x_i]$ and $L(g_i)$
depends only on $x_i$. This reduces the problem of solving $f = 0$ to solving a
triangular system, hence to the problem of finding the roots of a univariate
polynomial. Unfortunately, lexicographic Gröbner bases are rarely computed
directly in practice because of their high computational complexity. Instead, we
proceed as follows: first we compute a Gröbner basis for another monomial ordering and
then we use a conversion procedure to obtain a lexicographic one. For more
details see for instance [FGLM93].


3.1.4 General normal form 


The construction of Gröbner bases may not be numerically stable, as shown
in the following example:


Example 3.1.4. Let
> f1:= x[1]^2+x[2]^2-x[1]+x[2]-2; f2:= x[1]^2-x[2]^2+2*x[2]-3;


The Gröbner basis of $(f_1, f_2)$ for the graded lexicographic ordering with
$x_1 > x_2$ is:


> G:=gbasis([f1,f2],tdeg(x[1],x[2]));


\[ (2x_2^2 - x_1 - x_2 + 1,\ \ 2x_1^2 - x_1 + 3x_2 - 5). \]

The leading monomials of elements of $G$ are $x_1^2$ and $x_2^2$. A monomial basis
of $A$ is $\{1, x_1, x_2, x_1x_2\}$. Consider now a small perturbation of the system
$f_1 = f_2 = 0$ and compute its Gröbner basis for the same monomial ordering:


> gbasis([f1,f2+1.0/10000000*x[1]*x[2]],tdeg(x[1],x[2]));


\[ \begin{aligned}
(\,&-2x_2^2 + x_1 + x_2 - 1 + 0.0000001\,x_1x_2,\ \ x_1^2 + x_2^2 - x_1 + x_2 - 2, \\
&x_2^3 - 10000000.9999999999999950000000000000125\,x_2^2 \\
&+ 5000000.2500000124999993749999687500015625000781250\,x_1 \\
&+ 5000000.7500000374999931249999062500171875002343750\,x_2 \\
&- 5000000.2500000624999993749998437500015625003906250\,).
\end{aligned} \]

The leading monomials of this Gröbner basis are $x_1x_2$, $x_1^2$, $x_2^3$ and the
corresponding basis of the perturbed algebra is $\{1, x_1, x_2, x_2^2\}$. After a small
perturbation, the basis of the quotient algebra may “jump” from one set of
monomials to another one, though the two sets of solutions are very close from a
geometric point of view. Moreover, some polynomials of the Gröbner basis of
the perturbed system have large coefficients.
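
The origin of both effects can be seen in the first reduction step. The display below is a sketch in which the symbol $\varepsilon$ (introduced here for illustration only) stands for the perturbation $10^{-7}$, and $f_2^{\varepsilon}$ denotes the perturbed second equation.

```latex
\begin{align*}
f_2^{\varepsilon} &= f_2 + \varepsilon\,x_1x_2, \qquad \varepsilon = 10^{-7}, \\
f_1 - f_2^{\varepsilon} &= -\varepsilon\,x_1x_2 + 2x_2^2 - x_1 - x_2 + 1, \\
x_1x_2 &\equiv \frac{1}{\varepsilon}\bigl(2x_2^2 - x_1 - x_2 + 1\bigr) \pmod{(f_1, f_2^{\varepsilon})}.
\end{align*}
```

For $\varepsilon = 0$ the leading monomial of $f_1 - f_2$ is $x_2^2$. As soon as $\varepsilon \neq 0$ it becomes $x_1x_2$, because $x_1x_2 > x_2^2$ in the graded lexicographic ordering, and solving for this leading monomial introduces coefficients of size $1/\varepsilon = 10^7$.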


Thus, Gröbner basis computations may introduce artificial discontinuities due
to the choice of a monomial order. A recent generalization of this notion has
been proposed in [Mou99, MT00]. It is based on a new criterion which gives
a necessary and sufficient condition for a projection onto a vector subspace of
$R$ to be a normal form modulo the ideal $I$. More precisely we have:

Theorem 3.1.5. Let $B$ be a vector space in $R = K[x_1, \ldots, x_n]$ connected to
the constant polynomial 1 (footnote 5). If $B^+$ is the vector subspace generated by
$B \cup x_1B \cup \cdots \cup x_nB$ and $N : B^+ \to B$ is a linear map such that $N$ is the identity on
$B$, we define for $i = 1, \ldots, n$ the maps

\[ M_i : B \to B, \qquad b \mapsto M_i(b) := N(x_ib). \]

The two following properties are equivalent:


1. For all $1 \le i, j \le n$, $M_i \circ M_j = M_j \circ M_i$.
2. $R = B \oplus I$, where $I$ is the ideal generated by the kernel of $N$.


If this holds, the $B$-reduction along $\ker(N)$ is canonical.
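
The commutation criterion is easy to test on a small case. The sketch below takes $B$ spanned by $(1, x_1, x_2, x_1x_2)$ and the map $N$ defined by the unperturbed system of Example 3.1.4, that is, $N(x_1^2) = \tfrac12 x_1 - \tfrac32 x_2 + \tfrac52$ and $N(x_2^2) = \tfrac12 x_1 + \tfrac12 x_2 - \tfrac12$; the values of $N$ on $x_1^2x_2$ and $x_1x_2^2$ were derived by hand from these two relations, and the helper names `apply` and `compose` are hypothetical.

```python
from fractions import Fraction as F

# Each list is a column: the coordinates, in B = (1, x1, x2, x1*x2),
# of N(x_i * b) for b running through B.
M1_cols = [
    [0, 1, 0, 0],                            # x1 * 1      = x1
    [F(5, 2), F(1, 2), F(-3, 2), 0],         # x1 * x1     -> N(x1^2)
    [0, 0, 0, 1],                            # x1 * x2     = x1*x2
    [F(3, 4), F(-3, 4), F(7, 4), F(1, 2)],   # x1 * x1*x2  -> N(x1^2 * x2)
]
M2_cols = [
    [0, 0, 1, 0],                            # x2 * 1      = x2
    [0, 0, 0, 1],                            # x2 * x1     = x1*x2
    [F(-1, 2), F(1, 2), F(1, 2), 0],         # x2 * x2     -> N(x2^2)
    [F(5, 4), F(-1, 4), F(-3, 4), F(1, 2)],  # x2 * x1*x2  -> N(x1 * x2^2)
]


def apply(cols, v):
    """Matrix-vector product for a matrix stored as a list of columns."""
    n = len(v)
    return [sum(cols[j][i] * v[j] for j in range(n)) for i in range(n)]


def compose(A, B):
    """Columns of the product A*B: apply A to every column of B."""
    return [apply(A, col) for col in B]


commute = compose(M1_cols, M2_cols) == compose(M2_cols, M1_cols)
print(commute)
```

When the two products agree, as they do here, the first condition of the theorem holds, so $B$ is a complement of the ideal generated by $\ker(N)$.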


In Chapter 4, you will also find more material on this approach and a proof
of Theorem 3.1.5, in the special case of 0-dimensional ideals.

This leads to a completion-like algorithm which starts with the linear
subspace $K_0$ generated by the polynomials $f_1, \ldots, f_m$, which we wish to solve,
and iterates the construction $K_{i+1} = K_i^+ \cap L$, where $L$ is a fixed vector space.
We stop when $K_{i+1} = K_i$. See [Mou99, MT00, Tré02] for more details. This
approach allows us to fix first the set of monomials on which we want to do
linear operations and thus to treat more safely polynomials with approximate
coefficients. It can be adapted very naturally to Laurent polynomials, which is
not the case for Gröbner basis computations. Moreover it can be specialized
very efficiently to systems of equations for which the basis of $A$ is known a
priori, such as in the case of a complete projective intersection [MT00].


Example 3.1.6. For the perturbed system of the previous example, the normal
forms for the monomials on the border of $B = \{1, x_1, x_2, x_1x_2\}$ are:

\[ \begin{aligned}
x_1^2 &= -0.00000005\,x_1x_2 + \tfrac12 x_1 - \tfrac32 x_2 + \tfrac52, \\
x_2^2 &= +0.00000005\,x_1x_2 + \tfrac12 x_1 + \tfrac12 x_2 - \tfrac12, \\
x_1^2x_2 &= 0.49999999\,x_1x_2 - 0.74999998\,x_1 + 1.75000003\,x_2 + 0.74999994, \\
x_1x_2^2 &= 0.49999999\,x_1x_2 - 0.25000004\,x_1 - 0.74999991\,x_2 + 1.25000004.
\end{aligned} \]

This set of relations gives the matrices of multiplication by the variables $x_1$
and $x_2$ in $A$. An implementation by Ph. Trébuchet of an algorithm
computing this new type of normal form is available in the SYNAPS library (see
Solve(L,newmac<C>())).


3.2 Structure of the quotient algebra 


In this section we will see how to recover the solutions of the system f = 0 
from the structure of the algebra A, which we assume to be given through a 
normal form procedure. 


5 Any monomial $\mathbf{x}^\alpha \neq 1$ in $B$ is of the form $x_i\mathbf{x}^\beta$ with $\mathbf{x}^\beta \in B$ and some $i$ in
$\{1, \ldots, n\}$.
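3.2.1 Dual of the quotient algebra


First we consider the dual $\widehat{R}$, that is, the space of linear forms from $R$ to
$K$. The evaluation $\mathbf{1}_\zeta$ at a fixed point $\zeta$ is an example of such a linear form:
$p \in R \mapsto \mathbf{1}_\zeta(p) := p(\zeta) \in K$. Another class of linear forms is obtained by
differential operators, namely for $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$,

\[ \mathbf{d}^\alpha : R \to K, \qquad p \mapsto \frac{1}{\prod_{i=1}^n \alpha_i!}\bigl((\partial_1)^{\alpha_1} \cdots (\partial_n)^{\alpha_n} p\bigr)(0), \]

where $\partial_i$ is the derivative with respect to the variable $x_i$ (see also Section
2.2.2 of Chapter 2). If $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$ and $\beta = (\beta_1, \ldots, \beta_n) \in \mathbb{N}^n$,

\[ \mathbf{d}^\alpha(\mathbf{x}^\beta) = \begin{cases} 1 & \text{if } \alpha_i = \beta_i \text{ for } i = 1, \ldots, n, \\ 0 & \text{otherwise.} \end{cases} \]

It follows that $(\mathbf{d}^\alpha)_{\alpha \in \mathbb{N}^n}$ is the dual basis of the monomial basis $(\mathbf{x}^\alpha)_{\alpha \in \mathbb{N}^n}$ of
$R$. Notice that $(\mathbf{d}^\alpha)_{\alpha \in \mathbb{N}^n}$ can be defined for every characteristic. We assume
again that $K$ is a field of arbitrary characteristic. We deduce that for every
$\Lambda \in \widehat{R}$ we have $\Lambda = \sum_{\alpha \in \mathbb{N}^n} \Lambda(\mathbf{x}^\alpha)\,\mathbf{d}^\alpha$.

The vector space $\{\sum_{\alpha \in \mathbb{N}^n} c_\alpha\, d_1^{\alpha_1} \cdots d_n^{\alpha_n} : c_\alpha \in K\}$ (where $d_i^{\alpha_i}$ denotes
the map $p \in R \mapsto \frac{1}{\alpha_i!}(\partial_i^{\alpha_i} p)(0)$) of formal power series in $d_1, \ldots, d_n$ with
coefficients in $K$ is denoted by $K[[\mathbf{d}]] = K[[d_1, \ldots, d_n]]$. The linear map

\[ \Lambda \in \widehat{R} \mapsto \sum_{\alpha \in \mathbb{N}^n} \Lambda(\mathbf{x}^\alpha)\,\mathbf{d}^\alpha \in K[[\mathbf{d}]] \]

defines a one-to-one correspondence. So we can identify $\widehat{R}$ with $K[[\mathbf{d}]]$. Under
this identification, the linear form evaluation at 0 corresponds to the constant
power series 1; it is also denoted $\mathbf{d}^0$.


Example 3.2.1. Let $n = 3$. The value of the linear form $1 + d_1 + 2d_1d_2 + d_3^2$
on the polynomial $1 + x_1 + x_1x_2$ is:

\[ (1 + d_1 + 2d_1d_2 + d_3^2)(1 + x_1 + x_1x_2) = 4. \]

The dual $\widehat{R}$ has a natural structure of $R$-module: for $(p, \Lambda) \in R \times \widehat{R}$,

\[ p \cdot \Lambda : q \in R \mapsto (p \cdot \Lambda)(q) = \Lambda(pq) \in K. \]

If $p \in R$ and $\alpha_i \in \mathbb{N}^*$, we check that $d_i^{\alpha_i}(x_ip) = \frac{1}{(\alpha_i - 1)!}(\partial_i^{\alpha_i - 1}p)(0)$.
Consequently, for $p \in R$ and $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$ with $\alpha_i \neq 0$ for a fixed $i$, we
have

\[ (x_i \cdot \mathbf{d}^\alpha)(p) = \mathbf{d}^\alpha(x_ip) = d_1^{\alpha_1} \cdots d_{i-1}^{\alpha_{i-1}}\, d_i^{\alpha_i - 1}\, d_{i+1}^{\alpha_{i+1}} \cdots d_n^{\alpha_n}(p). \]

That is, $x_i$ acts as the inverse of $d_i$ in $K[[\mathbf{d}]]$. This is the reason why in the
literature such a representation is referred to as the inverse system (see for
instance [Mac94]). If $\alpha_i = 0$, then $x_i \cdot \mathbf{d}^\alpha = 0$. Then we redefine the product
$p \cdot \Lambda$ as follows:

Proposition 3.2.2. (see also [MP00], [Fuh96]) For $p \in R$ and $\Lambda \in K[[\mathbf{d}]]$,

\[ p \cdot \Lambda = \pi_+\bigl(p(d_1^{-1}, \ldots, d_n^{-1})\,\Lambda(\mathbf{d})\bigr), \]

where $\pi_+$ is the projection on the vector space generated by the monomials
with non-negative exponents.


Example 3.2.1 (continued).

\[ (1 + x_1 + x_1x_2) \cdot (1 + d_1 + d_1d_2 + d_3^2) = 3 + d_1 + d_1d_2 + d_3^2 + d_2. \]

The constant term of this expansion is the value of the linear form $1 + d_1 +
d_1d_2 + d_3^2$ at the polynomial $1 + x_1 + x_1x_2$.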
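
The product of Proposition 3.2.2 amounts to shifting exponents down and discarding anything that would become negative. A minimal sketch, assuming both $p$ and $\Lambda$ have finitely many terms and are stored as dictionaries from exponent tuples to coefficients (the function name `act` is chosen here for illustration):

```python
def act(p, lam):
    """Return p . Lambda for p in R and a finitely supported Lambda in K[[d]].

    Each term x^b of p lowers the exponents of d^a by b; terms in which an
    exponent would become negative are discarded (the projection pi_+).
    """
    out = {}
    for b, pc in p.items():
        for a, lc in lam.items():
            e = tuple(ai - bi for ai, bi in zip(a, b))
            if min(e) >= 0:
                out[e] = out.get(e, 0) + pc * lc
    return {e: c for e, c in out.items() if c != 0}


p = {(0, 0, 0): 1, (1, 0, 0): 1, (1, 1, 0): 1}                   # 1 + x1 + x1*x2
lam = {(0, 0, 0): 1, (1, 0, 0): 1, (1, 1, 0): 1, (0, 0, 2): 1}   # 1 + d1 + d1*d2 + d3^2
result = act(p, lam)
print(result)
```

The entry of `result` at exponent `(0, 0, 0)` is the constant term, that is, the value $\Lambda(p)$.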


3.2.2 Multiplication operators 
Since the variety $Z(I)$ is finite, the $K$-algebra $A$ has the decomposition

\[ A = A_1 \oplus \cdots \oplus A_d, \tag{3.1} \]

where $A_i$ is the local algebra associated with the root $\zeta_i$ (see also Section 2.7,
Chapter 2). So there are elements $e_1, \ldots, e_d \in A$ such that

\[ e_1 + \cdots + e_d = 1, \qquad e_i^2 = e_i, \qquad e_ie_j = 0 \ \text{if } i \neq j. \]

These elements are called the fundamental idempotents of $A$, and generalize
the univariate Lagrange polynomials. They satisfy $A_i = e_iA$ and $e_i(\zeta_j) = 1$ if
$i = j$ and 0 otherwise. The dimension of the $K$-vector space $A_i$ is by definition
the multiplicity of the root $\zeta_i$, and it is denoted by $\mu_{\zeta_i}$.

We recall that a linear form on $A$ can be identified with a linear form on $R$
which vanishes on the ideal $I$. Thus the evaluation $\mathbf{1}_\zeta$, which is a linear form
on $R$, is an element of $\widehat{A}$ iff $\zeta \in Z(I)$.

The first operators that come naturally in the study of $A$ are the operators
of multiplication by elements of $A$. For any $a \in A$, we define

\[ M_a : A \to A, \qquad b \mapsto M_a(b) := ab. \]

We also consider its transpose operator

\[ M_a^t : \widehat{A} \to \widehat{A}, \qquad \Lambda \mapsto M_a^t(\Lambda) = \Lambda \circ M_a. \]

The matrix of $M_a^t$ in the dual basis of a basis $B$ of $A$ is the transpose of the
matrix of $M_a$ in $B$.

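Example 3.1.3 (continued). Consider the matrix $M_{x_1}$ of multiplication by $x_1$
in the basis $B = \{1, x_1, x_2, x_1x_2\}$ of $A = K[x_1, x_2]/(f_1, f_2)$: we multiply the
monomials of $B$ by $x_1$ and reduce the products to their normal forms, so

\[ \begin{aligned}
1 \times x_1 &\equiv x_1, \qquad x_1 \times x_1 \equiv -x_1x_2 + x_1 + \tfrac16, \qquad x_2 \times x_1 \equiv x_1x_2, \\
x_1x_2 \times x_1 &\equiv -x_1x_2 + \tfrac{55}{54}x_1 + \tfrac{2}{27}x_2 + \tfrac{5}{54}.
\end{aligned} \]

Then

\[ M_{x_1} = \begin{pmatrix}
0 & \tfrac16 & 0 & \tfrac{5}{54} \\
1 & 1 & 0 & \tfrac{55}{54} \\
0 & 0 & 0 & \tfrac{2}{27} \\
0 & -1 & 1 & -1
\end{pmatrix}. \]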

The multiplication operators can be computed using a normal form
algorithm. This can be performed, for instance, by Gröbner basis computations
(see Sections 3.1.3 and 3.1.4). In Section 3.5, we will describe another way
to compute implicitly these operators based on resultant matrices (see also
Section 2.3, Chapter 2).

Hereafter, $\mathbf{x}^E = (\mathbf{x}^\alpha)_{\alpha \in E}$ denotes a monomial basis of $A$ (for instance
obtained by a Gröbner basis). Then any polynomial can be reduced modulo
$(f_1, \ldots, f_m)$ to a linear combination of monomials of $\mathbf{x}^E$.

The matrix approach to solve polynomial systems is based on the following 
fundamental theorem: 


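Theorem 3.2.3. Assume that $Z(I) = \{\zeta_1, \ldots, \zeta_d\}$. We have


1. Let $a \in A$. The eigenvalues of the operator $M_a$ (and its transpose $M_a^t$)
are $a(\zeta_1), \ldots, a(\zeta_d)$.


2. The common eigenvectors of $(M_a^t)_{a \in A}$ are (up to a scalar) $\mathbf{1}_{\zeta_1}, \ldots, \mathbf{1}_{\zeta_d}$.


Proof. 1) Let $i \in \{1, \ldots, d\}$. For every $b \in A$,

\[ \bigl(M_a^t(\mathbf{1}_{\zeta_i})\bigr)(b) = \mathbf{1}_{\zeta_i}(ab) = \bigl(a(\zeta_i)\,\mathbf{1}_{\zeta_i}\bigr)(b). \]

This shows that $a(\zeta_1), \ldots, a(\zeta_d)$ are eigenvalues of $M_a$ and $M_a^t$, that $\mathbf{1}_{\zeta_i}$ is an
eigenvector of $M_a^t$ associated with $a(\zeta_i)$, and that $\mathbf{1}_{\zeta_1}, \ldots, \mathbf{1}_{\zeta_d}$ are common eigenvectors
to $M_a^t$, $a \in A$.

Now we will show that every eigenvalue of $M_a$ is one of the $a(\zeta_i)$. For this we
consider

\[ p(\mathbf{x}) = \prod_{\zeta \in Z(I)} \bigl(a(\mathbf{x}) - a(\zeta)\bigr) \in \overline{K}[\mathbf{x}]. \]

This polynomial vanishes on $Z(I)$. Using Hilbert's Nullstellensatz we can find
an integer $m \in \mathbb{N}$ such that the operator

\[ p^m(M_a) = \prod_{\zeta \in Z(I)} \bigl(M_a - a(\zeta)\,\mathbb{I}\bigr)^m \]

vanishes on $A$ ($\mathbb{I}$ is the identity operator). We deduce that the minimal
polynomial of the operator $M_a$ divides $\prod_{\zeta \in Z(I)} (T - a(\zeta))^m$, and that the
eigenvalues of $M_a$ belong to $\{a(\zeta) : \zeta \in Z(I)\}$.

2) Let $\Lambda \in \widehat{A}$ be a common eigenvector to $M_a^t$, $a \in A$, and $\gamma = (\gamma_1, \ldots, \gamma_n)$
such that $M_{x_i}^t(\Lambda) = \gamma_i\Lambda$ for $i = 1, \ldots, n$. Then all the monomials $\mathbf{x}^\alpha$ satisfy

\[ \bigl(M_{x_i}^t(\Lambda)\bigr)(\mathbf{x}^\alpha) = \Lambda(x_i\mathbf{x}^\alpha) = \gamma_i\,\Lambda(\mathbf{x}^\alpha). \]

From this we deduce that $\Lambda = \Lambda(1)\,\mathbf{1}_\gamma$. As $\Lambda \in \widehat{A} = I^\perp$, $\Lambda(p) = \Lambda(1)\,p(\gamma) = 0$
for every $p \in I$, and $\mathbf{1}_\gamma \in \widehat{A}$.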


Since $\mathbf{x}^E = (\mathbf{x}^\alpha)_{\alpha \in E}$ is a basis of $A$, the coordinates of $\mathbf{1}_{\zeta_i}$ in the dual
basis of $\mathbf{x}^E$ are $(\zeta_i^\alpha)_{\alpha \in E}$. Thus if $\mathbf{x}^E$ contains $1, x_1, \ldots, x_n$ (which is often the
case), we deduce the following algorithm:


Algorithm 3.2.4 SOLVING IN THE CASE OF SIMPLE ROOTS.


Let $a \in A$ such that $a(\zeta_i) \neq a(\zeta_j)$ for $i \neq j$ (which is generically the case) and
$M_a$ be the matrix of multiplication by $a$ in the basis $\mathbf{x}^E = (1, x_1, \ldots, x_n, \ldots)$ of
$A$.


1. Compute the eigenvectors $\Lambda = (\Lambda_1, \Lambda_{x_1}, \ldots, \Lambda_{x_n}, \ldots)$ of $M_a^t$.

2. For each eigenvector $\Lambda$ with $\Lambda_1 \neq 0$, compute and output the point
$\zeta = \bigl(\tfrac{\Lambda_{x_1}}{\Lambda_1}, \ldots, \tfrac{\Lambda_{x_n}}{\Lambda_1}\bigr)$.


The set of output points $\zeta$ contains the simple roots (i.e. roots with
multiplicity 1) of $f = 0$, since for such a root the eigenspace associated to the
eigenvalue $a(\zeta)$ is one-dimensional and contains $\mathbf{1}_\zeta$. But as we will see in the
next example, it can also yield in some cases the multiple roots.
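
The two steps of the algorithm can be sketched with exact arithmetic. The code below is an illustration under stated assumptions: it uses the transpose of the matrix of multiplication by $x_1$ of Example 3.1.3 in the basis $(1, x_1, x_2, x_1x_2)$, it takes the eigenvalues $\pm\tfrac13$ as given instead of computing them, and it assumes each eigenspace is a line. The helper names `kernel_vector` and `root_from_eigenvalue` are hypothetical.

```python
from fractions import Fraction as F

# Transpose of the multiplication matrix by x1: row j holds the coordinates
# of the normal form of x1 * b_j in the basis (1, x1, x2, x1*x2).
Mt = [
    [0, 1, 0, 0],
    [F(1, 6), 1, 0, -1],
    [0, 0, 0, 1],
    [F(5, 54), F(55, 54), F(2, 27), -1],
]


def kernel_vector(A):
    """Return one nonzero vector of ker(A), assuming the kernel is a line."""
    n = len(A)
    rows = [[F(x) for x in row] for row in A]
    pivots = []  # pivot column of each reduced row
    r = 0
    for c in range(n):
        piv = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                rows[i] = [x - rows[i][c] * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(n) if c not in pivots)
    v = [F(0)] * n
    v[free] = F(1)
    for i, c in enumerate(pivots):
        v[c] = -rows[i][free]
    return v


def root_from_eigenvalue(lam):
    """Step 1: eigenvector of Mt for lam. Step 2: normalise and read the root."""
    shifted = [[Mt[i][j] - (lam if i == j else 0) for j in range(4)]
               for i in range(4)]
    v = kernel_vector(shifted)
    v = [x / v[0] for x in v]   # make the coordinate on the monomial 1 equal to 1
    return (v[1], v[2])         # coordinates on x1 and x2


roots = [root_from_eigenvalue(F(-1, 3)), root_from_eigenvalue(F(1, 3))]
print(roots)
```

In floating-point practice one would call a numerical eigensolver instead; exact rational elimination is used here only to keep the sketch free of external dependencies.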


Example 3.1.3 (continued). The eigenvalues, their multiplicities, and the
corresponding normalized eigenvectors of the transpose of the matrix of
multiplication by $x_1$ are:


> neigenvects(transpose(Mx1),1);


\[ \Bigl\{-\tfrac13,\ 2,\ V_1 = \bigl(1, -\tfrac13, \tfrac56, -\tfrac{5}{18}\bigr)\Bigr\}, \qquad \Bigl\{\tfrac13,\ 2,\ V_2 = \bigl(1, \tfrac13, \tfrac76, \tfrac{7}{18}\bigr)\Bigr\}. \]

As the basis of $A$ is $(1, x_1, x_2, x_1x_2)$, we deduce from Theorem 3.2.3 that the
solutions of the system $f_1 = f_2 = 0$ can be read off from the 2nd and the
3rd coordinates of the normalized eigenvectors: so $Z(I) = \{(-\tfrac13, \tfrac56), (\tfrac13, \tfrac76)\}$.
Moreover, the 4th coordinates of $V_1$ and $V_2$ are the products of the 2nd by the
3rd coordinates. In this example the multiplicity 2 of the two eigenvalues is
exactly the multiplicity of roots $\zeta_1$ and $\zeta_2$ (see Chapter 2, Proposition 2.1.14).

In order to compute exactly the set of roots counted with their multiplicity,
we use the following result. It is based on the fact that commuting matrices
share common eigenspaces and the decomposition (3.1) of the algebra $A$.


Theorem 3.2.5. [Mou98, MP00, CGT97] There exists a basis of $A$ such that
for all $a \in A$, the matrix of $M_a$ in this basis is of the form

\[ M_a = \begin{pmatrix} N_a^1 & & 0 \\ & \ddots & \\ 0 & & N_a^d \end{pmatrix} \quad\text{with}\quad N_a^i = \begin{pmatrix} a(\zeta_i) & & \star \\ & \ddots & \\ 0 & & a(\zeta_i) \end{pmatrix}. \]

Proof. For every $i \in \{1, \ldots, d\}$, the multiplication operators in $A_i$ by elements
of $A$ commute. Then using (3.1) it is possible to choose a basis of $A_i$ such
that the multiplication matrices $N_a^i$ by $a \in A$ in $A_i$ in this basis are
upper-triangular. By Theorem 3.2.3, $N_a^i$ has one eigenvalue, namely $a(\zeta_i)$.


We deduce the algorithm:


Algorithm 3.2.6 SOLVING BY SIMULTANEOUS TRIANGULATION.

INPUT: Matrices of multiplication $M_{x_i}$, $i = 1, \ldots, n$, in a basis of $A$.

1. Compute a (Schur) decomposition $P$ such that the matrices $T_i = PM_{x_i}P^{-1}$,
$i = 1, \ldots, n$, are upper-triangular.

2. Compute and output the diagonal vectors $t_i = (t^1_{i,i}, \ldots, t^n_{i,i})$ of the triangular
matrices $T_k = (t^k_{i,j})_{i,j}$.


OUTPUT: $Z(I) = \{t_i : i = 1, \ldots, \dim_K(A)\}$.


The first step in this algorithm is performed by computing a Schur
decomposition of $M_l$ (where $l$ is a generic linear form) which yields a change of basis
matrix $P$. Then we compute the triangular matrices $T_i = PM_{x_i}P^{-1}$, $i = 1, \ldots, n$,
since they commute with $M_l$.


3.2.3 Chow form and rational univariate representation 


In some problems it is important to have an exact representation of the roots 
of the system f = 0. We will represent these roots in terms of solutions of a 
univariate polynomial. More precisely, they will be the image of these solutions 
by a rational map. The aim of the following developments is to show how to
construct explicitly such a representation.


Definition 3.2.7. The Chow form of the ideal $I$ is the homogeneous
polynomial in $\mathbf{u} = (u_0, \ldots, u_n)$ defined by

\[ C_I(\mathbf{u}) = \det(u_0 + u_1M_{x_1} + \cdots + u_nM_{x_n}) \in K[\mathbf{u}]. \]

According to Theorem 3.2.5, we have:


Proposition 3.2.8. The Chow form

\[ C_I(\mathbf{u}) = \prod_{\zeta \in Z(I)} (u_0 + \zeta_1u_1 + \cdots + \zeta_nu_n)^{\mu_\zeta}. \]

Example 3.1.3 (continued). The Chow form of $I = (f_1, f_2)$ using the matrices
of multiplication by $x_1$ and $x_2$ is:


> factor(det(u[0]+ u[1]*Mx1+ u[2]*Mx2));


\[ \bigl(u_0 - \tfrac13 u_1 + \tfrac56 u_2\bigr)^2 \bigl(u_0 + \tfrac13 u_1 + \tfrac76 u_2\bigr)^2. \]

It is a product of linear forms whose coefficients yield the roots $\zeta_1 = (-\tfrac13, \tfrac56)$
and $\zeta_2 = (\tfrac13, \tfrac76)$ of $f_1 = f_2 = 0$. The exponents are the multiplicities of the
roots (here 2). When the points of $Z(I)$ are rational (as in this example) we
can easily factorize $C_I(\mathbf{u})$ as a product of linear forms and get the solutions
of the system $f = 0$. But usually, this factorization is possible only on an
algebraic extension of the field of coefficients (see Chapter 9 for more details
on this task).

From the Chow form, it is possible to deduce a rational univariate repre- 
sentation of Z(1): 


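Theorem 3.2.9. (see [Ren92, ABRW96, Rou99, EM99a, Lec00]) Let $\Delta(\mathbf{u})$
be a multiple of the Chow form $C_I(\mathbf{u})$. For a generic vector $\mathbf{t} \in K^{n+1}$ we write

\[ \frac{\Delta}{\gcd\bigl(\Delta, \frac{\partial \Delta}{\partial u_0}\bigr)}(\mathbf{t} + \mathbf{u}) = d_0(u_0) + u_1d_1(u_0) + \cdots + u_nd_n(u_0) + R(\mathbf{u}), \]

where $d_i(u_0) \in K[u_0]$, $R(\mathbf{u}) \in (u_1, \ldots, u_n)^2$, $\gcd(d_0(u_0), d_0'(u_0)) = 1$. Then
for all $\zeta \in Z(I)$, there exists a root $\zeta_0$ of $d_0(u_0)$ such that

\[ \zeta = \Bigl(\frac{d_1(\zeta_0)}{d_0'(\zeta_0)}, \ldots, \frac{d_n(\zeta_0)}{d_0'(\zeta_0)}\Bigr). \]

Proof. We decompose $\Delta(\mathbf{u})$ as

\[ \Delta(\mathbf{u}) = \Bigl(\prod_{\zeta \in Z(I)} (u_0 + \zeta_1u_1 + \cdots + \zeta_nu_n)^{n_\zeta}\Bigr) H(\mathbf{u}) \]

with $n_\zeta \in \mathbb{N}^*$, where $\prod_{\zeta \in Z(I)} (u_0 + \zeta_1u_1 + \cdots + \zeta_nu_n)^{n_\zeta}$ and $H(\mathbf{u})$ are relatively
prime. Let

\[ d(\mathbf{u}) = \frac{\Delta(\mathbf{u})}{\gcd\bigl(\Delta(\mathbf{u}), \frac{\partial \Delta}{\partial u_0}(\mathbf{u})\bigr)} = \Bigl(\prod_{\zeta \in Z(I)} (u_0 + \zeta_1u_1 + \cdots + \zeta_nu_n)\Bigr) h(\mathbf{u}), \]

where $\prod_{\zeta \in Z(I)} (u_0 + \zeta_1u_1 + \cdots + \zeta_nu_n)$ and $h(\mathbf{u})$ are relatively prime. If
$t = (t_1, \ldots, t_n) \in K^n$ and $\tilde{t} = (0, t_1, \ldots, t_n) \in K^{n+1}$, we have

\[ \begin{aligned}
d(\tilde{t} + \mathbf{u}) &= \Bigl(\prod_{\zeta \in Z(I)} \bigl(\langle t, \zeta\rangle + u_0 + \zeta_1u_1 + \cdots + \zeta_nu_n\bigr)\Bigr) h(\tilde{t} + \mathbf{u}) \\
&= d_0(u_0) + u_1d_1(u_0) + \cdots + u_nd_n(u_0) + r(\mathbf{u}),
\end{aligned} \]

with $\langle t, \zeta\rangle = t_1\zeta_1 + \cdots + t_n\zeta_n$, $d_0, \ldots, d_n \in K[u_0]$, $r(\mathbf{u}) \in (u_1, \ldots, u_n)^2$, and

\[ h(\tilde{t} + \mathbf{u}) = h_0(u_0) + u_1h_1(u_0) + \cdots + u_nh_n(u_0) + s(\mathbf{u}), \]

with $h_0, \ldots, h_n \in K[u_0]$ and $s(\mathbf{u}) \in (u_1, \ldots, u_n)^2$. By identification

\[ d_0(u_0) = \Bigl(\prod_{\zeta \in Z(I)} \bigl(\langle t, \zeta\rangle + u_0\bigr)\Bigr) h_0(u_0), \]

and for $i = 1, \ldots, n$,

\[ d_i(u_0) = \Bigl(\sum_{\zeta \in Z(I)} \zeta_i \prod_{\xi \neq \zeta} \bigl(\langle t, \xi\rangle + u_0\bigr)\Bigr) h_0(u_0) + \Bigl(\prod_{\zeta \in Z(I)} \bigl(\langle t, \zeta\rangle + u_0\bigr)\Bigr) h_i(u_0). \]

If $t \in K^n$ is generic, $\prod_{\zeta \in Z(I)} (\langle t, \zeta\rangle + u_0)$ and $h_0(u_0)$ are relatively prime. Let
$\zeta_0 = -\langle t, \zeta\rangle$ be a root of $d_0(u_0)$, then $h_0(\zeta_0) \neq 0$ and

\[ \begin{aligned}
d_0'(\zeta_0) &= \Bigl(\prod_{\xi \neq \zeta} \bigl(\langle t, \xi\rangle - \langle t, \zeta\rangle\bigr)\Bigr) h_0(\zeta_0), \\
d_i(\zeta_0) &= \zeta_i \Bigl(\prod_{\xi \neq \zeta} \bigl(\langle t, \xi\rangle - \langle t, \zeta\rangle\bigr)\Bigr) h_0(\zeta_0) \quad\text{for } i = 1, \ldots, n.
\end{aligned} \]

Moreover we can assume that the generic vector $t$ is such that $\langle t, \zeta\rangle \neq \langle t, \xi\rangle$
for $(\zeta, \xi) \in Z(I)^2$ and $\zeta \neq \xi$. Then

\[ \zeta_i = \frac{d_i(\zeta_0)}{d_0'(\zeta_0)} \quad\text{for } i = 1, \ldots, n. \]

This result describes the coordinates of solutions of $f = 0$ as the image
by a rational map of some roots of $d_0(u_0)$. It does not imply that any root of
$d_0(u_0)$ yields a point in $Z(I)$, so that this representation may be redundant.
However the “bad” prime factors in $d_0(u_0)$ can be removed by substituting
the rational representation back into the equations $f_1, \ldots, f_m$.

In Proposition 3.5.4 we will see how to obtain a multiple of $C_I(\mathbf{u})$ without
the knowledge of a basis of $A$.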
Algorithm 3.2.10 RATIONAL UNIVARIATE REPRESENTATION.
INPUT: A multiple $\Delta(\mathbf{u})$ of the Chow form of the ideal $I = (f_1, \ldots, f_m)$.


1. Compute the square-free part $d(\mathbf{u})$ of $\Delta(\mathbf{u})$.
2. Choose a generic $t \in K^n$, set $\tilde{t} = (0, t_1, \ldots, t_n)$, and compute the first terms of

\[ d(\tilde{t} + \mathbf{u}) = d_0(u_0) + u_1d_1(u_0) + \cdots + u_nd_n(u_0) + \cdots \]

3. Compute the redundant rational representation

\[ \zeta_1 = \frac{d_1(u_0)}{d_0'(u_0)}, \ \ldots, \ \zeta_n = \frac{d_n(u_0)}{d_0'(u_0)}, \qquad d_0(u_0) = 0. \]

4. Factorize $d_0(u_0)$, keep the “good” prime factors and output the rational
univariate representation of $Z(I)$.


Example 3.1.3 (continued). From the Chow form, we deduce the univariate
representation of $Z(I)$:


(wets) (wg) <0 S0= Caray Bate) 


This gives the solutions $\zeta_1 = (-\tfrac13, \tfrac56)$ and $\zeta_2 = (\tfrac13, \tfrac76)$
of $f_1 = f_2 = 0$.
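
As a hedged illustration of Algorithm 3.2.10 on this example, take $t = (1, 0)$, that is, $\tilde{t} = (0, 1, 0)$. This choice is an assumption made here for the sake of a worked computation and is not necessarily the one used in the text. Starting from the square-free part of the Chow form of $I = (f_1, f_2)$, whose linear factors are $u_0 + \zeta_1u_1 + \zeta_2u_2$ for the two roots:

```latex
\begin{align*}
d(\mathbf{u}) &= \bigl(u_0 - \tfrac13 u_1 + \tfrac56 u_2\bigr)\bigl(u_0 + \tfrac13 u_1 + \tfrac76 u_2\bigr), \\
d(\tilde{t} + \mathbf{u}) &= \bigl(u_0^2 - \tfrac19\bigr) + u_1\bigl(-\tfrac29\bigr) + u_2\bigl(2u_0 - \tfrac19\bigr) + \cdots, \\
d_0(u_0) &= u_0^2 - \tfrac19, \qquad d_0'(u_0) = 2u_0, \qquad d_1(u_0) = -\tfrac29, \qquad d_2(u_0) = 2u_0 - \tfrac19, \\
u_0 = \tfrac13 &: \ \Bigl(\tfrac{d_1}{d_0'}, \tfrac{d_2}{d_0'}\Bigr) = \bigl(-\tfrac13, \tfrac56\bigr), \qquad
u_0 = -\tfrac13 : \ \Bigl(\tfrac{d_1}{d_0'}, \tfrac{d_2}{d_0'}\Bigr) = \bigl(\tfrac13, \tfrac76\bigr).
\end{align*}
```

Both roots of $d_0$ give a point of $Z(I)$ here, so no “bad” factor has to be discarded.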


3.2.4 Real roots 


Now we assume that the polynomials $f_1, \ldots, f_m$ have real coefficients: $K = \mathbb{R}$.
A natural question which arises in many practical problems is: how many real
solutions does the system $f = 0$ have? We will use properties of the linear
form trace to answer this question.


Definition 3.2.11. The linear form trace, denoted by $\mathrm{Tr}$, is defined by

\[ \mathrm{Tr} : A \to \mathbb{R}, \qquad a \mapsto \mathrm{Tr}(a) := \mathrm{tr}(M_a), \]

where $\mathrm{tr}(M_a)$ is the trace of the linear operator $M_a$.
According to Theorem 3.2.5, we have

\[ \mathrm{Tr} = \sum_{\zeta \in Z(I)} \mu_\zeta\,\mathbf{1}_\zeta. \]

We associate to $\mathrm{Tr}$ and to any $h \in A$ the quadratic form:

\[ Q_h : (a, b) \in A \times A \mapsto Q_h(a, b) := \mathrm{Tr}(hab) \in \mathbb{R}, \]

which gives the following generalization of a result due to Hermite for counting
the number of real roots.


Theorem 3.2.12. (See [PRS93, GVRR99]) Let $h \in \mathbb{R}[\mathbf{x}]$. We have:


1. The rank of the quadratic form $Q_h$ is the number of distinct complex roots
$\zeta$ of $f = 0$ such that $h(\zeta) \neq 0$.
2. The signature of $Q_h$ is equal to

\[ \#\{\zeta \in \mathbb{R}^n : f_1(\zeta) = \cdots = f_m(\zeta) = 0,\ h(\zeta) > 0\} - \#\{\zeta \in \mathbb{R}^n : f_1(\zeta) = \cdots = f_m(\zeta) = 0,\ h(\zeta) < 0\}, \]

where $\#$ denotes the cardinality of a set.


In particular, if $h = 1$, the rank of $Q_1$ is the number of distinct complex
roots of $f = 0$ and its signature is the number of real roots of this system.
This allows us to analyze the geometry of the real roots as illustrated in the 
following example: 


Example 3.1.3 (continued). By direct computations, we have

$$\mathrm{Tr}(1) = 4, \quad \mathrm{Tr}(x_1) = 0, \quad \mathrm{Tr}(x_2) = 4, \quad \mathrm{Tr}(x_1 x_2) = \tfrac{2}{9}.$$

We deduce the value of the linear form Tr on the other interesting monomials
by using the transpose operators $M_{x_i}^t$ as follows:


> T0 := evalm([4,0,4,2/9]):
> T1 := evalm(transpose(Mx1)&*T0): T2 := evalm(transpose(Mx2)&*T0):
> T11 := evalm(transpose(Mx1)&*T1): T12 := evalm(transpose(Mx2)&*T1):
> T112 := evalm(transpose(Mx2)&*T11):
> Q1 := matrix(4,4,[T0,T1,T2,T12]);
> Qx1 := matrix(4,4,[T1,T11,T12,T112]);


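So we obtain

$$Q_1 = \begin{pmatrix} 4 & 0 & 4 & \frac{2}{9} \\ 0 & \frac{4}{9} & \frac{2}{9} & \frac{4}{9} \\ 4 & \frac{2}{9} & \frac{37}{9} & \frac{4}{9} \\ \frac{2}{9} & \frac{4}{9} & \frac{4}{9} & \frac{37}{81} \end{pmatrix}, \qquad Q_{x_1} = \begin{pmatrix} 0 & \frac{4}{9} & \frac{2}{9} & \frac{4}{9} \\ \frac{4}{9} & 0 & \frac{4}{9} & \frac{2}{81} \\ \frac{2}{9} & \frac{4}{9} & \frac{4}{9} & \frac{37}{81} \\ \frac{4}{9} & \frac{2}{81} & \frac{37}{81} & \frac{4}{81} \end{pmatrix}.$$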


The rank and the signatures of the quadratic forms $Q_1$ and $Q_{x_1}$ are


> rank(Q1), signature(Q1), rank(Qx1), signature(Qx1);


2, (2,0), 2, (1,1),


which tells us (without computing these roots) that there are 2 real roots, one
with $x_1 < 0$ and another with $x_1 > 0$.
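
The same counting can be tried on a univariate toy case. The sketch below is an illustration under stated assumptions, not the chapter's Maple session: it takes $A = \mathbb{R}[x]/(p)$ with $p$ monic, uses the fact that $\mathrm{Tr}(x^k)$ is the $k$-th Newton sum of the roots, builds the matrix of $Q_h$ in the basis $1, x, \ldots, x^{d-1}$ for $h = 1$ and $h = x$, and reads off the inertia by exact congruence diagonalization. Function names are hypothetical.

```python
from fractions import Fraction


def newton_sums(a, count):
    """Power sums s_0..s_{count-1} of the roots of x^d + a[0] x^(d-1) + ... + a[d-1]."""
    d = len(a)
    s = [Fraction(d)]
    for k in range(1, count):
        total = sum(a[i - 1] * s[k - i] for i in range(1, min(k - 1, d) + 1))
        if k <= d:
            total += k * a[k - 1]
        s.append(-Fraction(total))
    return s


def hermite_matrix(a, shift=0):
    """Matrix of Q_h in the basis 1, x, ..., x^(d-1) for h = x^shift."""
    d = len(a)
    s = newton_sums(a, 2 * d - 1 + shift)
    return [[s[i + j + shift] for j in range(d)] for i in range(d)]


def inertia(matrix):
    """(number of positive, number of negative) squares of a symmetric matrix."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    pos = neg = 0
    for k in range(n):
        p = next((i for i in range(k, n) if a[i][i] != 0), None)
        if p is None:
            pair = next(((i, j) for i in range(k, n)
                         for j in range(i + 1, n) if a[i][j] != 0), None)
            if pair is None:
                break  # the remaining block is zero
            i, j = pair
            for c in range(n):  # add row j to row i, then column j to column i
                a[i][c] += a[j][c]
            for r in range(n):
                a[r][i] += a[r][j]
            p = i
        a[k], a[p] = a[p], a[k]  # symmetric permutation bringing the pivot to (k, k)
        for row in a:
            row[k], row[p] = row[p], row[k]
        pivot = a[k][k]
        if pivot > 0:
            pos += 1
        else:
            neg += 1
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            if factor != 0:
                for c in range(n):
                    a[i][c] -= factor * a[k][c]
                for r in range(n):
                    a[r][i] -= factor * a[r][k]
    return pos, neg


# p = x^4 - 1: four distinct complex roots, two of them real (1 and -1).
p = [0, 0, 0, -1]
print(inertia(hermite_matrix(p)))           # inertia of Q_1
print(inertia(hermite_matrix(p, shift=1)))  # inertia of Q_x
```

For $p = x^4 - 1$ the inertia of $Q_1$ is $(3, 1)$, so the rank is 4 and the signature is 2, the number of real roots. The inertia of $Q_x$ is $(2, 2)$, so the signature is 0: one real root is positive and one is negative.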
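3.3 Duality

In this section $m = n$. Let us define the notion of Bezoutian matrix that will
be useful in the following.


Definition 3.3.1. The Bezoutian $\Theta_{f_0,\ldots,f_n}$ of $f_0, \ldots, f_n \in R$ is the polynomial

$$\Theta_{f_0,\ldots,f_n}(x, y) = \begin{vmatrix} f_0(x) & \theta_1(f_0)(x,y) & \cdots & \theta_n(f_0)(x,y) \\ \vdots & \vdots & & \vdots \\ f_n(x) & \theta_1(f_n)(x,y) & \cdots & \theta_n(f_n)(x,y) \end{vmatrix} \in K[x, y],$$

where

$$\theta_i(f_j)(x, y) = \frac{f_j(y_1,\ldots,y_{i-1},x_i,\ldots,x_n) - f_j(y_1,\ldots,y_i,x_{i+1},\ldots,x_n)}{x_i - y_i}.$$

Set $\Theta_{f_0,\ldots,f_n}(x,y) = \sum_{\alpha,\beta} a_{\alpha,\beta}\, x^\alpha y^\beta$ with $a_{\alpha,\beta} \in K$; we order the monomials
$x^\alpha y^\beta$, and the matrix $B_{f_0,\ldots,f_n} := (a_{\alpha,\beta})_{\alpha,\beta}$ is called the Bezoutian matrix
of $f_0, \ldots, f_n$.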


The Bezoutian was initially used by E. Bézout to construct the resultant of 
two polynomials in one variable [Béz64]. 

When $f_0$ is the constant 1 and $f$ is the polynomial map $(f_1, \ldots, f_n)$, the
Bezoutian $\Theta_{1,f_1,\ldots,f_n}$ will be denoted by $\Delta_f$.

We will define the residue $\tau_f$ associated to $f = (f_1, \ldots, f_n)$ and we will
give some of its important properties (for more details see [SS75], [Kun86],
[EM96], [BCRS96], also Chapter 1 of this book).

The dual $\widehat{A}$ of the vector space $A$ has a natural structure of $A$-module: If
$(a, \Lambda) \in A \times \widehat{A}$, the linear form $a \cdot \Lambda$ is $b \in A \mapsto (a \cdot \Lambda)(b) := \Lambda(ab)$.


Definition 3.3.2. The finite $K$-algebra $A$ is called Gorenstein if the
$A$-modules $\widehat{A}$ and $A$ are isomorphic.


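Set $\Delta_f = \sum_{\alpha,\beta} a_{\alpha,\beta}\, x^\alpha y^\beta$ with $a_{\alpha,\beta} \in K$; we define the linear map

$$\Delta_f^{\triangleright} : \widehat{R} \to R, \qquad \Lambda \mapsto \Delta_f^{\triangleright}(\Lambda) := \sum_{\alpha,\beta} a_{\alpha,\beta}\, \Lambda(y^\beta)\, x^\alpha .$$

This map induces naturally a linear one also denoted by $\Delta_f^{\triangleright} : \widehat{A} \to A$. Since
the number of polynomials $m$ is equal to the number $n$ of variables and the
affine variety $Z(I)$ is finite, one can prove that $\Delta_f^{\triangleright}$ is an isomorphism of
$A$-modules (see [SS75], [Kun86], [EM96], [BCRS96]). Then $A$ is a Gorenstein
algebra. Thus we can state the following definition: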


Definition 3.3.3. The residue $\tau_f$ of $f = (f_1, \ldots, f_n)$ is the linear form on $R$
such that

1. $\tau_f(h) = 0$, $\forall h \in I$,
2. $\Delta_f^{\triangleright}(\tau_f) - 1 \in I$.


In the univariate case, let $f = f_d x^d + \cdots + f_0$ be a polynomial of degree $d$. For
$h \in R$ let $r = r_{d-1} x^{d-1} + \cdots + r_0$ be the remainder in the Euclidean division
of $h$ by $f$, then

$$\tau_f(h) = \frac{r_{d-1}}{f_d}. \qquad (3.2)$$
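
Formula (3.2) is easy to experiment with. The following sketch assumes rational coefficients and stores polynomials as coefficient lists with the constant term first; the function names are illustrative. It also checks, on one example, the relation between the residue and Newton sums that is used later in this section, namely that $\tau_f(f' x^k)$ is the $k$-th power sum of the roots.

```python
from fractions import Fraction


def poly_rem(h, f):
    """Remainder of h in the Euclidean division by f (leading coefficient f[-1] != 0)."""
    r = [Fraction(c) for c in h]
    d = len(f) - 1
    while len(r) - 1 >= d:
        q = r[-1] / Fraction(f[-1])
        shift = len(r) - 1 - d
        for i, c in enumerate(f):
            r[shift + i] -= q * c
        r.pop()  # the leading coefficient is now zero
    return r


def residue(h, f):
    """Univariate residue tau_f(h) = r_{d-1} / f_d, following formula (3.2)."""
    d = len(f) - 1
    r = poly_rem(h, f)
    r += [Fraction(0)] * (d - len(r))  # pad so that r[d-1] exists
    return r[d - 1] / Fraction(f[-1])


def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]


# f = 2x^3 - 2x = 2x(x - 1)(x + 1), with simple roots 0, 1, -1.
f = [0, -2, 0, 2]
print(residue([0, 0, 1], f))                       # tau_f(x^2)
print(residue(derivative(f), f))                   # tau_f(f'), the number of roots
print(residue(poly_mul(derivative(f), [0, 0, 1]), f))  # tau_f(f' x^2), sum of squared roots
```

Since the roots of this $f$ are simple, the first value can be cross-checked against $\sum_\zeta \zeta^2 / f'(\zeta) = 0 + \tfrac14 + \tfrac14$.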


In the multivariate case, if for each $i = 1, \ldots, n$, $f_i$ depends only on $x_i$, then

$$\tau_f(x_1^{\alpha_1} \cdots x_n^{\alpha_n}) = \tau_{f_1}(x_1^{\alpha_1}) \cdots \tau_{f_n}(x_n^{\alpha_n}). \qquad (3.3)$$

If the roots of $f_1 = \cdots = f_n = 0$ are simple (this is equivalent to the fact
that the Jacobian of $f$, denoted by $\mathrm{Jac}(f)$, does not vanish on $Z(I)$), then

$$\tau_f = \sum_{\zeta \in Z(I)} \frac{\mathbf{1}_\zeta}{\mathrm{Jac}(f)(\zeta)}.$$

But in the general multivariate setting the situation is more complicated.
We will show how to compute effectively $\tau_f$ for an arbitrary map $f$.


An important tool in the duality theory is the transformation law. 


Proposition 3.3.4. (Classical transformation law)
Let $g = (g_1, \ldots, g_n)$ be another polynomial map such that the variety defined
by $g_1, \ldots, g_n$ is finite and

$$\forall i = 1, \ldots, n, \quad g_i = \sum_{j=1}^{n} a_{i,j}\, f_j \quad \text{with } a_{i,j} \in K[x].$$

Then $\tau_f = \det(a_{i,j}) \cdot \tau_g$.


Proposition 3.3.5. (Generalized transformation law [BY99, EM96]).

Let $(f_0, \ldots, f_n)$ and $(g_0, \ldots, g_n)$ be two maps of $K[x_0, x] = K[x_0, x_1, \ldots, x_n]$
which define finite affine varieties. We assume that $f_0 = g_0$ and there are
positive integers $m_i$ and polynomials $a_{i,j}$ such that

$$\forall i = 1, \ldots, n, \quad f_0^{m_i}\, g_i = \sum_{j=1}^{n} a_{i,j}\, f_j .$$

Then $\tau_{(f_0,\ldots,f_n)} = \det(a_{i,j}) \cdot \tau_{(g_0^{m_1 + \cdots + m_n + 1},\, g_1, \ldots, g_n)}$.


If $f_0 = x_0$ and $m_1 = \cdots = m_n = 0$, the generalized transformation law reduces
to the classical one.
Another important fact in this theory is the following formula: 


$$\mathrm{Jac}(f) \cdot \tau_f = \mathrm{Tr}, \qquad (3.4)$$


where $\mathrm{Tr} : a \in R \mapsto \mathrm{Tr}(a) \in K$ ($\mathrm{Tr}(a)$ is the trace of the endomorphism of
multiplication by $a$ in the vector space $A$). If the characteristic of $K$ is 0, we
deduce from this formula that $\dim_K(A) = \tau_f(\mathrm{Jac}(f))$.
3.3.1 Residue calculus


The effective construction of the residue of the polynomial map $f = (f_1, \ldots, f_n)$
is based on the computation of algebraic relations between $f_1, \ldots, f_n$ and the
coordinate functions $x_i$ (see also Section 1.5.4 of Chapter 1). We give here a
method using Bezoutian matrices to get them.

Let $f_0, \ldots, f_n$ be $n+1$ elements of $R$ such that the $n$ polynomials $f_1, \ldots, f_n$
are algebraically independent over $K$. For algebraic dimension reasons there
is a nonzero polynomial $P$ such that $P(f_0, \ldots, f_n) = 0$. We will show how to
find such a $P$ by means of the Bezoutian matrix.


Proposition 3.3.6. (see [EM00]) Let $u = (u_0, \ldots, u_n)$ be new parameters.
Then every nonzero maximal minor $P(u_0, \ldots, u_n)$ of the Bezoutian matrix
of the elements $f_0 - u_0, \ldots, f_n - u_n$ in $K[u_0, \ldots, u_n][x]$ satisfies the identity

$$P(f_0, \ldots, f_n) = 0.$$


This proposition comes from the fact that we can write the Bezoutian matrix
of $f_0 - u_0, \ldots, f_n - u_n$ (up to invertible matrices with coefficients in
$K(u_1, \ldots, u_n)$) as

$$\begin{pmatrix} M_{f_0} - u_0 I & 0 \\ 0 & \ast \end{pmatrix} \qquad (3.5)$$

where $I$ is the identity matrix, $M_{f_0}$ is the matrix of multiplication by $f_0$ in the
vector space $K(u_1, \ldots, u_n)[x]/(f_1 - u_1, \ldots, f_n - u_n)$. By the Cayley-Hamilton
theorem every maximal minor of this Bezoutian matrix gives an algebraic
relation between $f_0, \ldots, f_n$ (for more details see [EM00]).

In practice, we use a fraction free Gaussian elimination (Bareiss method) 
in order to find a nonzero maximal minor of the Bezoutian matrix (see the 
implementation of the function melim in the MULTIRES package). 
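
To give an idea of the fraction-free elimination mentioned above, here is a minimal sketch of the Bareiss scheme for a square matrix with integer entries. It is only an illustration: the chapter applies the method to matrices with polynomial entries through `melim`, whose interface is not reproduced here, and the function name below is hypothetical.

```python
def bareiss_det(matrix):
    """Determinant of a square integer matrix by fraction-free (Bareiss) elimination.

    Every division below is exact, so all intermediate entries stay integers.
    """
    a = [list(row) for row in matrix]
    n = len(a)
    sign = 1
    prev = 1  # pivot of the previous step
    for k in range(n - 1):
        if a[k][k] == 0:
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0  # the whole column is zero below the diagonal
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


print(bareiss_det([[2, 1, 1], [1, 3, 2], [1, 0, 0]]))
```

The intermediate entries are themselves minors of the input matrix, which is why the method suits the search for a nonzero maximal minor.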

We will see now how to compute effectively the residue $\tau_f$.


Proposition 3.3.7. For $i \in \{1, \ldots, n\}$, let

$$P_i(u_0, \ldots, u_n) = a_{i,0}(u_1, \ldots, u_n)\, u_0^{m_i} + \cdots + a_{i,m_i}(u_1, \ldots, u_n)$$

be an algebraic relation between $x_i, f_1, \ldots, f_n$. If for each $i$ there is $k_i \in
\{0, \ldots, m_i - 1\}$ such that $a_{i,k_i}(0) \neq 0$, then for $h \in R$ the computation of
the multivariate residue $\tau_f(h)$ reduces to univariate residue calculus.


Proof. If $j_i = \min\{k : a_{i,k}(0) \neq 0\}$, we have

$$g_i(x_i) = a_{i,j_i}(0)\, x_i^{m_i - j_i} + \cdots + a_{i,m_i}(0) = \sum_{j=1}^{n} A_{i,j}\, f_j, \qquad A_{i,j} \in K[x].$$

By the transformation law and (3.3) there are scalars $c_\alpha$ such that

$$\tau_f(h) = \tau_{(g_1, \ldots, g_n)}\big(h \det(A_{i,j})\big) = \sum_{\alpha} c_\alpha\, \tau_{g_1}(x_1^{\alpha_1}) \cdots \tau_{g_n}(x_n^{\alpha_n}).$$


If $w$ are formal parameters, similarly for every $h \in R$, $\tau_{f-w}(h)$ is
a rational function in $w$ whose denominator is the product of powers of
$a_{1,0}(w), \ldots, a_{n,0}(w)$. But it is not clear how to recover $\tau_f(h)$ from this
function. For an arbitrary map $f$, $\tau_f(h)$ can be computed using the generalized
transformation law.


For $(\alpha_1, \ldots, \alpha_n) \in K^n$ and a new variable $x_0$, we define the multi-index
$m = (m_1, \ldots, m_n)$ and the polynomials $R_i$, $S_i$ as follows: If $P_i(u_0, \ldots, u_n)$ is
an algebraic relation between $x_i, f_1, \ldots, f_n$, then there are $B_{i,j} \in K[x_0, \ldots, x_n]$
such that

$$P_i(x_i, \alpha_1 x_0, \ldots, \alpha_n x_0) = \sum_{j=1}^{n} (f_j - \alpha_j x_0)\, B_{i,j} \qquad (3.6)$$

$$\phantom{P_i(x_i, \alpha_1 x_0, \ldots, \alpha_n x_0)} = x_0^{m_i}\big(R_i(x_i) - x_0\, S_i(x_i, x_0)\big). \qquad (3.7)$$


From the transformation laws we deduce the following result: 


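Proposition 3.3.8. If for each $i = 1, \ldots, n$, the univariate polynomial $R_i$ does
not vanish identically, then for $h \in R$ we have

$$\tau_f(h) = \sum_{k \in \mathbb{N}^n :\, |k| \le |m|} \tau_{(x_0^{|m|+1-|k|},\, R_1^{k_1+1}, \ldots, R_n^{k_n+1})}\big(S_1^{k_1} \cdots S_n^{k_n}\, h \det(B_{i,j})\big).$$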


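Proof. From (3.6) and Proposition 3.3.5, we have

$$\tau_f(h) = \tau_{(x_0,\, f_1 - \alpha_1 x_0, \ldots, f_n - \alpha_n x_0)}(h) = \tau_{(x_0^{|m|+1},\, R_1 - x_0 S_1, \ldots, R_n - x_0 S_n)}\big(h \det(B_{i,j})\big).$$

Using the identities

$$R_i^{|m|+1} - (x_0 S_i)^{|m|+1} = (R_i - x_0 S_i) \sum_{k_i = 0}^{|m|} R_i^{|m| - k_i} (x_0 S_i)^{k_i}, \qquad i = 1, \ldots, n,$$

and Proposition 3.3.4 we deduce the formula in Proposition 3.3.8.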


Propositions 3.3.6 and 3.3.8 give an effective algorithm to compute the 
residue of a map in the multivariate setting. They reduce the multivariate 
residue calculus to the univariate one. 


We will show how to use the residue for solving polynomial systems. Let
$\zeta_1, \ldots, \zeta_D$ be the solutions of the system $f = 0$ (each solution appears as many
times as its multiplicity). Let us fix $i \in \{1, \ldots, n\}$. Using formula (3.4) and
Theorem 3.2.5, we can compute the Newton sums

$$S_j = \tau_f\big(x_i^j\, \mathrm{Jac}(f)\big) = \mathrm{Tr}(x_i^j) = \zeta_{1,i}^j + \cdots + \zeta_{D,i}^j,$$

where $\zeta_{1,i}, \ldots, \zeta_{D,i}$ are the $i$-th coordinates of $\zeta_1, \ldots, \zeta_D$. If $\sigma_1, \ldots, \sigma_D$ are the
elementary symmetric functions of $\zeta_{1,i}, \ldots, \zeta_{D,i}$ (i.e. $\sigma_j = (-1)^j \sum_{1 \le i_1 < \cdots < i_j \le D} \zeta_{i_1,i} \cdots \zeta_{i_j,i}$),
we can obtain the univariate polynomial

$$A_i(T) = (T - \zeta_{1,i}) \cdots (T - \zeta_{D,i}) = T^D + \sigma_1 T^{D-1} + \cdots + \sigma_D$$

by means of the Newton identities:

$$k\, \sigma_k = -S_k - \sigma_1 S_{k-1} - \cdots - \sigma_{k-1} S_1, \qquad 1 \le k \le D. \qquad (3.8)$$

The residue $\tau_f$ allows us to find the univariate polynomials $A_i(T)$, $1 \le i \le n$,
and then to deduce the $i$-th coordinates of the roots of the system
$f_1 = \cdots = f_n = 0$.
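
The recurrence (3.8) translates directly into a few lines of code. The sketch below assumes that the Newton sums $S_1, \ldots, S_D$ are already known as exact rationals (here they are typed in by hand rather than obtained from a residue computation), and returns the coefficients $\sigma_1, \ldots, \sigma_D$ of the monic polynomial with those power sums. The function name is illustrative.

```python
from fractions import Fraction


def coefficients_from_newton_sums(S):
    """Given S = [S_1, ..., S_D], return [sigma_1, ..., sigma_D] with
    T^D + sigma_1 T^(D-1) + ... + sigma_D = prod (T - zeta_l),
    using k*sigma_k = -S_k - sigma_1*S_{k-1} - ... - sigma_{k-1}*S_1."""
    sigma = []
    for k in range(1, len(S) + 1):
        total = Fraction(S[k - 1])
        for j in range(1, k):
            total += sigma[j - 1] * S[k - j - 1]
        sigma.append(-total / k)
    return sigma


# Roots 1, 2, 3 have power sums 6, 14, 36 and (T-1)(T-2)(T-3) = T^3 - 6T^2 + 11T - 6.
print(coefficients_from_newton_sums([6, 14, 36]))
```

The division by $k$ is the reason why this reconstruction requires characteristic 0 (or characteristic larger than $D$).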

For other applications of residue theory see [EM98, EM]. 


3.4 Resultant constructions 


Projection is one of the most used operations in effective algebraic geometry 
[Eis95, CLO98]. It reduces the dimension of the problem that we have to solve
and often simplifies it. The resultant is a tool to perform such a projection and 
has many applications in this domain. It leads to efficient methods for solving 
polynomial equations based on matrix formulations [EM99b]. We present here 
different notions of resultants (see also Chapter 1). 

We recall that a resultant of a polynomial system $f_c$ on a complete variety
$X$ is a polynomial $\mathrm{Res}_X(f_c)$ in the coefficients $c$ of this system (considered
as variables) such that the vanishing of $\mathrm{Res}_X(f_c)$ is a necessary and sufficient
condition for $f_c$ to have a solution in the variety $X$. The best known
formulation of the resultant is in the case of two univariate polynomials. It is given
by the Sylvester matrix. Another classical one is the projective resultant of n 
homogeneous polynomials in n variables. It can be computed using Macaulay 
matrices (see Chapter 2, Section 2.3, or [DD01]). Recently a refined notion 
of resultants (on toric varieties) has been studied. It takes into account the 
actual monomials appearing in the polynomials. Its construction follows the 
same process as in the projective case except that the notion of degree is re- 
placed by the support of a polynomial (for more details see Chapter 7). Here 
we will focus on an even more recent generalization of these resultant notions. 
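
As a reminder of the classical univariate case mentioned above, the following sketch builds the Sylvester matrix of two polynomials with rational coefficients and evaluates its determinant exactly. The layout (shifted rows of $f$ followed by shifted rows of $g$, highest degree first) is one common convention and the function names are illustrative.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for k in range(n):
        p = next((i for i in range(k, n) if a[i][k] != 0), None)
        if p is None:
            return Fraction(0)
        if p != k:
            a[k], a[p] = a[p], a[k]
            result = -result
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for c in range(k, n):
                a[i][c] -= factor * a[k][c]
    return result


def sylvester(f, g):
    """Sylvester matrix of f and g, given as coefficient lists, highest degree first."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (size - n - 1 - i) for i in range(m)]
    return rows


f = [1, -3, 2]                       # (x - 1)(x - 2)
print(det(sylvester(f, [1, -1])))    # g = x - 1 shares the root 1 with f
print(det(sylvester(f, [1, -3])))    # g = x - 3 has no common root with f
```

The determinant vanishes exactly when the two polynomials have a common root; for monic $f$ it equals the product of the values of $g$ at the roots of $f$.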


3.4.1 Resultant over a unirational variety 


A natural extension of the toric resultant is to replace the monomial
parameterization by a rational one. The polynomial system $f_c$ is defined on an open
subset of $K^n$ and is of the form

$$f_c := \begin{cases} f_0(t) = \sum_{j=0}^{k_0} c_{0,j}\, \kappa_{0,j}(t) \\ \quad\vdots \\ f_n(t) = \sum_{j=0}^{k_n} c_{n,j}\, \kappa_{n,j}(t) \end{cases} \qquad (3.9)$$

where $t = (t_1, \ldots, t_n) \in K^n$ and the $\kappa_{i,j}$ are nonzero rational functions, which
we can assume to be polynomials by reduction to the same denominator.
Let $K_i = (\kappa_{i,j})_{j=0,\ldots,k_i}$ and $U$ be the open subset of $K^n$ such that $K_i(t) \neq 0$
on $U$ for $i = 0, \ldots, n$. Assume that there exist $\sigma_0, \ldots, \sigma_N \in R$ defining a map

$$\sigma : U \to \mathbb{P}^N, \qquad t \mapsto (\sigma_0(t) : \cdots : \sigma_N(t)),$$

and homogeneous polynomials $\psi_{i,j}(x_0, \ldots, x_N)$, $i = 0, \ldots, n$, $j = 0, \ldots, k_i$,
satisfying

$$\kappa_{i,j}(t) = \psi_{i,j}(\sigma_0(t), \ldots, \sigma_N(t)) \quad \text{and} \quad \deg(\psi_{i,j}) = \deg(\psi_{i,0}) \ge 1.$$


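Let $X^\circ$ be the image of $\sigma$ and $X$ be its closure in $\mathbb{P}^N$. In order to construct
the resultant associated to the system (3.9) on the variety $X$ we assume the
following conditions (D):


(D1) The Jacobian matrix of $\sigma = (\sigma_i)_{i=0,\ldots,N}$ is of rank $n$ at one point of $U$,
(D2) For generic $c$, $f_1 = \cdots = f_n = 0$ has a finite number of solutions in $U$.


We will show that these conditions are sufficient to define the resultant. Let
$U^\circ = \{t \in U : \kappa_{i,0}(t) \neq 0 \text{ for } i = 0, \ldots, n\}$ be the dense open subset of $U$ and
consider the parameterization

$$\tau : \mathbb{P}^{k_0 - 1} \times \cdots \times \mathbb{P}^{k_n - 1} \times U^\circ \to \mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n} \times \mathbb{P}^N, \qquad (\tilde c_0, \ldots, \tilde c_n, t) \mapsto (c_0, \ldots, c_n, \sigma(t))$$

with $c_i = (c_{i,0}, \tilde c_i)$ and $c_{i,0} = -\frac{1}{\kappa_{i,0}(t)} \sum_{j=1}^{k_i} c_{i,j}\, \kappa_{i,j}(t)$. We denote by $W^\circ$ the
image of $\tau$, $W$ its closure in $\mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n} \times \mathbb{P}^N$, and by
$\pi_1 : \mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n} \times \mathbb{P}^N \to \mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n}$ and
$\pi_2 : \mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n} \times \mathbb{P}^N \to \mathbb{P}^N$ the canonical projections.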

Theorem 3.4.1. Under the conditions (D), the variety $W$ is irreducible and
projects onto a hypersurface $Z = \pi_1(W)$. Moreover if $\mathrm{Res}_X(f_c)$ is one equation
of $Z$, for any specialization of the parameters $c = (c_{i,j})$, $\mathrm{Res}_X(f_c) = 0$ if and
only if there exists $(c, x) \in W$ such that $f_i(x) := \sum_{j} c_{i,j}\, \psi_{i,j}(x) = 0$ for
$i = 0, \ldots, n$.


Proof. The variety $W$ is the closure of a parameterized variety, so it is
irreducible and its projection $Z$ is also irreducible.

According to (D1), the Jacobian of $\sigma$ is of rank $n$ on an open subset of
$U$. This implies that the dimension of the variety $X$ is $n$. The fibers of the
projection $\pi_2 : W^\circ \to X^\circ$ are linear spaces of dimension $\sum_{i=0}^{n} k_i - n - 1$, for
we have $K_i(t) \neq 0$ when $t \in U$. By the fiber theorem ([Sha77] or [Har95]), we
deduce that $W$ is of dimension $\sum_{i=0}^{n} k_i - 1$.

Consider now the restriction of $\pi_1$ to $W^\circ$. According to (D2), there exists
an open subset of $\mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n}$ on which the number of solutions of the
system $f_1 = \cdots = f_n = 0$ is finite. The fibers of $\pi_1$ on this open subset are
therefore of dimension 0. This shows that the projection $\pi_1(W^\circ)$, and thus
$Z$, is of the same dimension as $W$, that is a hypersurface of $\mathbb{P}^{k_0} \times \cdots \times \mathbb{P}^{k_n}$
defined (up to a scalar) by one equation $\mathrm{Res}_X(f_c)$.

As the fibers of $\pi_2$ above $X^\circ$ are of dimension $\sum_{i=0}^{n} k_i - n - 1$ and $W$
is of dimension $\sum_{i=0}^{n} k_i - 1$, $\pi_2(W)$ is an irreducible variety of dimension $n$
containing $X^\circ$. This shows that $X = \pi_2(W)$. Consequently for a specialization
of the coefficients $c$, $\mathrm{Res}_X(f_c) = 0$ iff there exists $x \in X$ such that $(c, x) \in W$,
i.e. $f_i(x) = 0$ for $i = 0, \ldots, n$.


The degree of the resultant $\mathrm{Res}_X(f_c)$ in the coefficients $c_{i,j}$ of $f_i$ is
bounded by (but not necessarily equal to) the generic number of points of
$V_i = Z(f_0, \ldots, f_{i-1}, f_{i+1}, \ldots, f_n) \cap X$. In the case where the linear forms
$f_i(\zeta)$, $\zeta \in V_i$, in $c_{i,j}$, are all distinct, the degree of $\mathrm{Res}_X(f_c)$ in the
coefficients of $f_i$ is exactly the number of generic roots of $V_i$. This is the case when
$t_1, \ldots, t_n$ appear among the $\kappa_{i,j}$, $j = 0, \ldots, k_i$, as it is illustrated below.

We can compute a non-trivial multiple of $\mathrm{Res}_X(f_c)$ using the Bezoutian
matrix.


Theorem 3.4.2. Assume that the conditions (D) are satisfied. Then any
maximal minor of the Bezoutian matrix $B_{f_0,\ldots,f_n}$ is divisible by $\mathrm{Res}_X(f_c)$.


This theorem is a consequence of hypotheses (D) and the fact that if the
variety defined by $f_1, \ldots, f_n$ is finite then the Bezoutian of $f_0, \ldots, f_n$ admits
a block decomposition of the form (3.5); for more details see [BEM00].


Example 3.4.3. Consider the three following polynomials:

$$\begin{aligned} f_0 &= c_{0,0} + c_{0,1} t_1 + c_{0,2} t_2 + c_{0,3}(t_1^2 + t_2^2) \\ f_1 &= c_{1,0} + c_{1,1} t_1 + c_{1,2} t_2 + c_{1,3}(t_1^2 + t_2^2) + c_{1,4}(t_1^2 + t_2^2)^2 \\ f_2 &= c_{2,0} + c_{2,1} t_1 + c_{2,2} t_2 + c_{2,3}(t_1^2 + t_2^2) + c_{2,4}(t_1^2 + t_2^2)^2. \end{aligned}$$

We are looking for conditions on the coefficients $c_{i,j}$ such that these three
elements have a common "root". The projective resultant of these polynomials
in $\mathbb{P}^2$ is zero (for all the values of parameters $c_{i,j}$), because the corresponding
homogenized polynomials vanish at the points $(0 : 1 : i)$ and $(0 : 1 : -i)$.
The toric resultant also vanishes (these polynomials have common roots in
the associated toric variety). Now we consider the map

$$\sigma : K^2 \to \mathbb{P}^3, \qquad (t_1, t_2) \mapsto (1 : t_1 : t_2 : t_1^2 + t_2^2).$$

The rank of the Jacobian matrix of $\sigma$ is 2 and

$$\psi_0 = (x_0, x_1, x_2, x_3), \qquad \psi_1 = \psi_2 = (x_0^2,\ x_0 x_1,\ x_0 x_2,\ x_0 x_3,\ x_3^2),$$

where $(x_0 : x_1 : x_2 : x_3)$ are the homogeneous coordinates in $\mathbb{P}^3$. We have
$f_i = \sum_j c_{i,j}\, \psi_{i,j} \circ \sigma$ for $i = 0, 1, 2$. For generic values of the coefficients $c_{i,j}$, the
system $f_1 = f_2 = 0$ has a finite number of solutions in $K^2$. By Theorem 3.4.2,
any nonzero maximal minor of $B_{f_0,f_1,f_2}$ is divisible by $\mathrm{Res}_X(f_0, f_1, f_2)$.


> mbezout([f1,f2,f3],[t1,t2]);


The Bezoutian matrix of $f_0, f_1, f_2$ is of size $12 \times 12$ and has rank 10. A maximal
minor is a huge polynomial in $(c_{i,j})$ containing 207805 monomials. It can be
factored as $q_1 q_2 (q_3)^2 \rho$, with

$$\begin{aligned} q_1 &= -c_{0,2} c_{1,3} c_{2,4} + c_{0,2} c_{1,4} c_{2,3} + c_{1,2} c_{0,3} c_{2,4} - c_{2,2} c_{0,3} c_{1,4} \\ q_2 &= c_{0,1} c_{1,3} c_{2,4} - c_{0,1} c_{1,4} c_{2,3} - c_{1,1} c_{0,3} c_{2,4} + c_{2,1} c_{0,3} c_{1,4} \\ q_3 &= c_{0,3}^2 c_{1,1}^2 c_{2,4}^2 - 2 c_{0,3}^2 c_{1,1} c_{2,1} c_{2,4} c_{1,4} + c_{0,3}^2 c_{2,4}^2 c_{1,2}^2 + \cdots \\ \rho &= c_{2,0}^4 c_{1,4}^4 c_{0,2}^4 + c_{2,0}^4 c_{1,4}^4 c_{0,1}^4 + c_{1,0}^4 c_{2,4}^4 c_{0,2}^4 + c_{1,0}^4 c_{2,4}^4 c_{0,1}^4 + \cdots \end{aligned}$$

The polynomials $q_3$ and $\rho$ contain respectively 20 and 2495 monomials.
As for generic equations $f_0, f_1, f_2$, the number of points in the varieties
$Z(f_0, f_1)$, $Z(f_0, f_2)$, $Z(f_1, f_2)$ is 4 (see for instance [Mou96]), the resultant
$\mathrm{Res}_X(f_0, f_1, f_2)$ is homogeneous of degree 4 in the coefficients of each $f_i$.
Thus, $\mathrm{Res}_X(f_0, f_1, f_2)$ is equal to the factor $\rho$.


3.4.2 Residual resultant 


In practical situations the equations have common zeroes which are
independent of the parameters of the problem. These "degenerate" zeroes are
not interesting for the resolution of this problem. We present here a
resultant construction which allows us to remove these degenerate solutions when
they form a complete intersection [BEM01] (for more details see [BEM01],
[BKM90, CU02, Bus01a]).

We denote by $S$ (resp. $S_\nu$, for $\nu \in \mathbb{N}$) the set of homogeneous polynomials
(resp. of degree $\nu$) in the variables $x_0, \ldots, x_n$ with coefficients in $K$.

Let $g_1, \ldots, g_r$ be $r$ (with $r < n + 1$) homogeneous polynomials in $S$ of
degree $k_1 \ge \cdots \ge k_r$, and let $d_0 \ge \cdots \ge d_n$ be $n + 1$ integers such that
$d_n \ge \max(k_1, k_r + 1)$. We assume that $G = (g_1, \ldots, g_r)$ is a complete intersection
and we consider the system


$$f_c := \begin{cases} f_0(x) = \sum_{i=1}^{r} h_{i,0}(x)\, g_i(x) \\ \quad\vdots \\ f_n(x) = \sum_{i=1}^{r} h_{i,n}(x)\, g_i(x) \end{cases}$$

where $h_{i,j}(x) = \sum_{|\alpha| = d_j - k_i} c^{i,j}_{\alpha}\, x^\alpha$ is the generic homogeneous polynomial of
degree $d_j - k_i$. We look for a condition on the coefficients $c = (c^{i,j}_{\alpha})$ such that
$f_c$ has a solution "outside" the variety defined by $G$. Such a condition is given
by the residual resultant defined in [BEM01]. This resultant is constructed as
a resultant over the blow-up $\pi : \tilde X \to X = \mathbb{P}^n$ of $X$ along the coherent sheaf
of ideals $\mathcal{G}$ associated to $G$ ([Har83]).

If $\tilde{\mathcal{G}}$ is the sheaf on $\tilde X$ inverse image of $\mathcal{G}$ by $\pi$ and $\tilde{\mathcal{G}}_{d_i} = \tilde{\mathcal{G}} \otimes \pi^*(\mathcal{O}_X(d_i))$,
the degree of the residual resultant in the coefficients of each $f_j$ is
$N_j = \int_{\tilde X} \prod_{i \neq j} c_1(\tilde{\mathcal{G}}_{d_i})$, where $c_1(\tilde{\mathcal{G}}_{d_i})$ is the first Chern class of $\tilde{\mathcal{G}}_{d_i}$. Using intersection
theory [Ful98], we can give an explicit formula for $N_j$ if $G$ is a complete
intersection. More precisely we have:


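Theorem 3.4.4. [BEM01] There exists an irreducible and homogeneous
polynomial $\mathrm{Res}_{G,d_0,\ldots,d_n}$ in $K[c]$ which satisfies

$$\mathrm{Res}_{G,d_0,\ldots,d_n}(f_0, \ldots, f_n) = 0 \iff Z(F : G) \neq \emptyset.$$

Moreover, if for a fixed $j \in \{0, \ldots, n\}$ we denote by $d$ the $n$-tuple
$d = (d_0, \ldots, d_{j-1}, d_{j+1}, \ldots, d_n)$,

$$\sigma_0(d) = (-1)^n, \quad \sigma_1(d) = (-1)^{n-1} \sum_{l \neq j} d_l, \quad \sigma_2(d) = (-1)^{n-2} \sum_{j_1 \neq j,\ j_2 \neq j,\ j_1 < j_2} d_{j_1} d_{j_2}, \quad \ldots, \quad \sigma_n(d) = \prod_{l \neq j} d_l,$$

$$r_j(T) = \sigma_n(d) + \sum_{l=r}^{n} \sigma_{n-l}(d)\, T^l,$$

and

$$P_{r_j}(y_1, \ldots, y_r) = \det \begin{pmatrix} r_j(y_1) & \cdots & r_j(y_r) \\ y_1 & \cdots & y_r \\ \vdots & & \vdots \\ y_1^{r-1} & \cdots & y_r^{r-1} \end{pmatrix},$$

then the degree of $\mathrm{Res}_{G,d_0,\ldots,d_n}$ in the coefficients of each polynomial $f_j$ is

$$N_j = \frac{P_{r_j}}{P_1}(k_1, \ldots, k_r),$$

where $P_1$ is the same determinant with $r_j$ replaced by the constant polynomial 1.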


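Let $\Delta_{i_1 \ldots i_r}$ denote the $r \times r$ minor of the matrix $(h_{i,j})$
corresponding to the columns $i_1, \ldots, i_r$, and let $(e_0, \ldots, e_n)$ and $(\tilde e_0, \ldots, \tilde e_n)$ be two bases
of the $S$-module $S^{n+1}$. A matrix whose determinant is a nonzero multiple
of $\mathrm{Res}_{G,d_0,\ldots,d_n}$ can be constructed using the following result: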


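Theorem 3.4.5. [BEM01] For $\nu \ge \nu_{d,k} := \sum_{i=0}^{n} d_i - n - (n - r + 2)\, k_r$, the
map

$$\partial_\nu : \Big( \bigoplus_{0 \le i_1 < \cdots < i_r \le n} S_{\nu - d_{i_1} - \cdots - d_{i_r} + \sum_{i=1}^{r} k_i}\ e_{i_1} \wedge \cdots \wedge e_{i_r} \Big) \oplus \Big( \bigoplus_{i=0}^{n} S_{\nu - d_i}\ \tilde e_i \Big) \to S_\nu$$

$$e_{i_1} \wedge \cdots \wedge e_{i_r} \mapsto \Delta_{i_1 \ldots i_r}, \qquad \tilde e_i \mapsto f_i$$

is surjective if and only if $Z(F : G) = \emptyset$. In this case, every nonzero maximal
minor of size $\dim_K(S_\nu)$ of the matrix of $\partial_\nu$ is a multiple of $\mathrm{Res}_{G,d_0,\ldots,d_n}$, and
the gcd of all these minors is exactly the residual resultant.


This result is based on the resolution of the ideal $((f_0, \ldots, f_n) : G)$ given
in [BKM90].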


Example 3.4.6. (The residual of two points in $\mathbb{P}^2$). We consider the following
system in $\mathbb{P}^2$:

$$\begin{aligned} f_0 &= a_0 x_0^2 + a_1 x_0 x_1 + a_2 x_0 x_2 + a_3 (x_1^2 + x_2^2) \\ f_1 &= b_0 x_0^2 + b_1 x_0 x_1 + b_2 x_0 x_2 + b_3 (x_1^2 + x_2^2) \\ f_2 &= c_0 x_0^2 + c_1 x_0 x_1 + c_2 x_0 x_2 + c_3 (x_1^2 + x_2^2). \end{aligned}$$

If $G = (x_0,\ x_1^2 + x_2^2)$, then $\nu_{d,k} = 2$ and a nonzero maximal minor of the matrix of
$\partial_\nu$ is

$$\begin{vmatrix} a_0 & b_0 & c_0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -b_1 c_3 + c_1 b_3 & -b_2 c_3 + c_2 b_3 & -c_1 a_3 + a_1 c_3 \\ a_1 & b_1 & c_1 & 0 & -c_3 b_0 + b_3 c_0 & 0 \\ a_2 & b_2 & c_2 & -c_3 b_0 + b_3 c_0 & 0 & a_0 c_3 - c_0 a_3 \\ a_3 & b_3 & c_3 & 0 & -b_1 c_3 + c_1 b_3 & 0 \\ a_3 & b_3 & c_3 & -b_2 c_3 + c_2 b_3 & 0 & -c_2 a_3 + a_2 c_3 \end{vmatrix}$$

The formula for the degrees gives $N_0 = N_1 = N_2 = 2$ and we check that
this minor is the residual resultant times $c_3 (c_1 b_3 - c_3 b_1)$. It has the minimal
degree $N_0$ in the coefficients of $f_0$. In this example the projective and toric
resultants vanish identically.


Example 3.4.7. (The residual of a curve in $\mathbb{P}^3$). We consider the following
system of cubics in $\mathbb{P}^3$ containing the umbilic:

$$\begin{aligned} f_0 ={}& (a_0 x_0 + a_1 x_1 + a_2 x_2 + a_3 x_3)(x_0^2 + x_1^2 + x_2^2) + (a_4 x_0^2 + a_5 x_1^2 + a_6 x_2^2 + a_7 x_3^2 \\ & + a_8 x_0 x_1 + a_9 x_0 x_2 + a_{10} x_0 x_3 + a_{11} x_1 x_2 + a_{12} x_1 x_3 + a_{13} x_2 x_3)\, x_3 \\ f_1 ={}& (b_0 x_0 + b_1 x_1 + b_2 x_2 + b_3 x_3)(x_0^2 + x_1^2 + x_2^2) + (b_4 x_0^2 + b_5 x_1^2 + b_6 x_2^2 + b_7 x_3^2 \\ & + b_8 x_0 x_1 + b_9 x_0 x_2 + b_{10} x_0 x_3 + b_{11} x_1 x_2 + b_{12} x_1 x_3 + b_{13} x_2 x_3)\, x_3 \\ f_2 ={}& (c_0 x_0 + c_1 x_1 + c_2 x_2 + c_3 x_3)(x_0^2 + x_1^2 + x_2^2) + (c_4 x_0^2 + c_5 x_1^2 + c_6 x_2^2 + c_7 x_3^2 \\ & + c_8 x_0 x_1 + c_9 x_0 x_2 + c_{10} x_0 x_3 + c_{11} x_1 x_2 + c_{12} x_1 x_3 + c_{13} x_2 x_3)\, x_3 \\ f_3 ={}& (d_0 x_0 + d_1 x_1 + d_2 x_2 + d_3 x_3)(x_0^2 + x_1^2 + x_2^2) + (d_4 x_0^2 + d_5 x_1^2 + d_6 x_2^2 + d_7 x_3^2 \\ & + d_8 x_0 x_1 + d_9 x_0 x_2 + d_{10} x_0 x_3 + d_{11} x_1 x_2 + d_{12} x_1 x_3 + d_{13} x_2 x_3)\, x_3 \end{aligned}$$

Let $G = (x_3,\ x_0^2 + x_1^2 + x_2^2)$. The previous construction gives $N_0 = N_1 =
N_2 = N_3 = 15$. The size of the matrix $M_\nu$ of $\partial_\nu$ is $84 \times 200$. A maximal
minor of rank 84 whose determinant has degree 15 in the coefficients of $f_0$ has
been constructed as follows. We extract from $M_\nu$ 69 independent columns (by
considering a random specialization). We add to this submatrix the columns
of $M_\nu$ depending on the coefficients of $f_0$ and independent of the 69 columns,
in order to get an $84 \times 84$ matrix with a nonzero determinant. It yields a
nonzero multiple of the residual resultant. Notice that the projective and
toric resultants are identically 0 in this example.
3.5 Geometric solvers


Let us describe now how to exploit the resultant constructions to solve poly- 
nomial systems. 


3.5.1 Multiplicative structure 


Let $f_0, \ldots, f_n \in R$ and let

$$M_0 = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}$$

be the transpose of the matrix
defined in Section 2.3 of Chapter 2. Here, we use the natural convention that
the columns of the resultant matrices represent multivariate polynomials.

Theorem 3.5.1. [PS96, ER94, MP00, CLO98] For generic systems $f_1, \ldots, f_n$,
the matrix of multiplication by $f_0$ in the basis

$$x^{E_0} = \{x_1^{\alpha_1} \cdots x_n^{\alpha_n} : 0 \le \alpha_i < \deg f_i,\ i = 1, \ldots, n\}$$

of $A = R/(f_1, \ldots, f_n)$ is the Schur complement of $M_{11}$ in $M_0$, namely
$M_{f_0} = M_{00} - M_{01} M_{11}^{-1} M_{10}$.


Proof. (see also proof of Theorem 2.3.2 of Chapter 2) Since $x^{E_0}$ is a basis of
the quotient by the polynomials $x_1^{d_1}, \ldots, x_n^{d_n}$, it remains a basis for generic
polynomials $f_1, \ldots, f_n$ of degree $d_1, \ldots, d_n$.

In order to compute the matrix of $M_{f_0}$ in this basis, we have first to
multiply the elements of the basis by $f_0$. This is represented in a matrix form
by the block $C_0 := \begin{pmatrix} M_{00} \\ M_{10} \end{pmatrix}$. Then we have to reduce these polynomials in terms
of the basis $x^{E_0}$ by multiples of polynomials $f_1, \ldots, f_n$. The multiples that
we use are represented by the coefficient matrix $C_1 := \begin{pmatrix} M_{01} \\ M_{11} \end{pmatrix}$. The reduction
corresponds to the matrix operation $C_0 - C_1 M_{11}^{-1} M_{10}$ which yields the block

$$M_{f_0} = M_{00} - M_{01} M_{11}^{-1} M_{10}.$$
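
The mechanism of the proof can be observed in a univariate analogue, where the Sylvester matrix plays the role of $M_0$. This is an illustrative assumption, not the `mresultant` construction of the chapter: we take $f_1 = x^2 - 3x + 2$ and $f_0 = x^2$, index the rows by $1, x, x^2, x^3$ and the columns by $f_0$, $x f_0$, $f_1$, $x f_1$, and compute the Schur complement exactly. Function names are hypothetical.

```python
from fractions import Fraction


def inverse(m):
    """Exact inverse by Gauss-Jordan elimination (StopIteration if m is singular)."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for k in range(n):
        p = next(i for i in range(k, n) if a[i][k] != 0)
        a[k], a[p] = a[p], a[k]
        pivot = a[k][k]
        a[k] = [x / pivot for x in a[k]]
        for i in range(n):
            if i != k and a[i][k] != 0:
                factor = a[i][k]
                a[i] = [x - factor * y for x, y in zip(a[i], a[k])]
    return [row[n:] for row in a]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def schur_complement(m, k):
    """M00 - M01 * M11^(-1) * M10, where M00 is the leading k x k block of m."""
    m00 = [row[:k] for row in m[:k]]
    m01 = [row[k:] for row in m[:k]]
    m10 = [row[:k] for row in m[k:]]
    m11 = [row[k:] for row in m[k:]]
    correction = matmul(matmul(m01, inverse(m11)), m10)
    return [[x - y for x, y in zip(r0, r1)] for r0, r1 in zip(m00, correction)]


# Columns: f0 = x^2, x*f0 = x^3, f1 = x^2 - 3x + 2, x*f1; rows: 1, x, x^2, x^3.
M0 = [
    [0, 0, 2, 0],
    [0, 0, -3, 2],
    [1, 0, 1, -3],
    [0, 1, 0, 1],
]
print(schur_complement(M0, 2))
```

The columns of the result are the normal forms of $x^2 \cdot 1 \equiv 3x - 2$ and $x^2 \cdot x \equiv 7x - 6$ modulo $f_1$, that is, the matrix of multiplication by $f_0$ in the basis $\{1, x\}$.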


Example 3.1.3 (continued). The matrix $M_0$ associated to the polynomials $f_1, f_2$
of Example 3.1.3, and a generic linear form $f_0 = u_0 + u_1 x_1 + u_2 x_2$ is:


> M_0 := mresultant([u[0]+u[1]*x[1]+u[2]*x[2],f1,f2],[x[1],x[2]]);


u 00 0/0 0 2/0 0 -§ 
ubz uy 0 02 0 -8/0 —% O 
ub; 0 uy Of} 0 2 -8/-% 0 -1 
O uy Ug Ug/—8 —8 8 -1 1 
‘ice | 2 ae S130 os 
o"" | 0 wd 0|-8 0 4/0 0 0 
0000/0 13 0)1 0 0 
0000/4 0 0]0 0 0 
00 0mj13 8 O]1 1 0 
00 0wm/8 4 0/0 1 0 
In this example a basis of $A$ is $x^{E_0} = \{1, x_1, x_2, x_1 x_2\}$. The Schur complement
$M_{00} - M_{01} M_{11}^{-1} M_{10}$ of $M_{11}$ in $M_0$ is the $4 \times 4$ matrix:


> M(u) := uschur(M_0,4);


25 E 5 5 
Uo ~97 U2 gUl Bq U1 54 U2 
2 
U2 Up + 2uU2 0 a7 U1 + 3q U2 
M(u) = 5 55 55 
Uy —q U2 Uo + Uy 5a Ul — 54 U2 
0 U1 + Fue U2 U1 UO uy +2 U2 


By Theorem 3.5.1, the coefficient of $u_i$ in $M(u)$ is the matrix of the operator
$M_{x_i}$.

An advantage of this approach is that we have a direct matrix representation
of the multiplication operator without using an algorithm to compute a
normal form in $A$. This formula is a continuous function of the coefficients of
input polynomials in the open set of systems such that $M_{11}$ is invertible. Thus
it can be used with approximated coefficients, which is useful in many practical
applications. However the main drawback is that the size of the matrix $M_0$
increases very quickly with the number of variables. One way to tackle this
problem consists in exploiting the structure of the matrices (i.e. their sparsity
and quasi-Toeplitz structure) as described in [MP00, BMP00]. Another way
to handle it and to keep a continuous representation of the matrix of multiplication
has been proposed in [MT00]. In some sense, it combines the previous
resultant approach with the normal form method proposed in Section 3.1.4,
replacing the computation of a big Schur complement $M_{00} - M_{01} M_{11}^{-1} M_{10}$ by
the inversion of much smaller systems.

In the next table, we compare the size of different systems to invert (first
lines) with the size $m$ of the matrix $M_{11}$ to invert in Macaulay's formulation,
in the case of projective resultants of quadrics ($d_i = 2$) in $\mathbb{P}^n$. Here $D$ is the
Bézout bound or the dimension of the $K$-vector space $A$.


 n |    5      6      7      8      9      10       11
---+---------------------------------------------------
   |    5      6      7      8      9      10       11
   |   20     30     42     56     72      90      110
   |   30     60    105    168    252     360      495
   |   20     60    140    280    504     840     1320
   |    5     30    105    280    630    1260     2310
   |           6     42    168    504    1260     2772
   |                  7     56    252     840     2310
   |                         8     72     360     1320
   |                                9      90      495
   |                                       10      110
   |                                                11
---+---------------------------------------------------
   |   80    192    448   1024   2304    5120    11264
 m |  430   1652   6307  24054  91866  351692  1350030
 D |   32     64    128    256    512    1024     2048


3.5.2 Solving by hiding a variable 


Another approach to solve a system of polynomial equations consists in hiding
a variable (that is, in considering one of the variables as a parameter), and
in searching the values of this hidden variable for which the system has a
solution. Typically, if we have $n$ equations $f_1 = 0, \ldots, f_n = 0$ in $n$ variables, we
"hide" a variable, say $x_n$, and apply one of the resultant constructions described
before to the overdetermined system $f_1 = 0, \ldots, f_n = 0$ in the $n - 1$ variables
$x_1, \ldots, x_{n-1}$ and a parameter $x_n$. This leads to a resultant matrix $S(x_n)$ with
polynomial entries in $x_n$. It can be decomposed as

$$S(x_n) = S_d\, x_n^d + S_{d-1}\, x_n^{d-1} + \cdots + S_0,$$

where $S_i$ has coefficients in $K$ and the same size as $S(x_n)$. We look for the
values $\zeta_n$ of $x_n$ for which the system has a solution $\zeta' = (\zeta_1, \ldots, \zeta_{n-1})$ in the
corresponding variety $X'$ (of dimension $n - 1$) associated with the resultant
formulation. This implies that

$$v(\zeta')^t\, S(\zeta_n) = 0, \qquad (3.10)$$

where $v(\zeta')$ is the vector of monomials indexing the rows of $S$ evaluated at
$\zeta'$. Conversely, for generic systems of the corresponding resultant formulation
there is only one point $\zeta'$ above the value $\zeta_n$. Thus the vectors $v$ satisfying
$S(\zeta_n)^t\, v = 0$ are scalar multiples of $v(\zeta')$. From the entries of these vectors,
we can deduce the other coordinates of the point $\zeta'$. This will be assumed
hereafter*.
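
A tiny instance makes the idea concrete. The sketch below is an illustration with assumed data: it takes $f_1 = x^2 + y^2 - 5$ and $f_2 = xy - 2$, hides $y$, uses the Sylvester matrix in $x$ as resultant matrix $S(y)$, and scans a few integer candidates for the hidden variable instead of solving an eigenproblem. The remaining coordinate is then recovered by back-substitution in $f_2$ rather than from a kernel vector.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for k in range(n):
        p = next((i for i in range(k, n) if a[i][k] != 0), None)
        if p is None:
            return Fraction(0)
        if p != k:
            a[k], a[p] = a[p], a[k]
            result = -result
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for c in range(k, n):
                a[i][c] -= factor * a[k][c]
    return result


def hidden_variable_matrix(y):
    """Sylvester matrix in x of f1 and f2 for a given value of the hidden variable y.

    Columns are indexed by x^2, x, 1.
    """
    return [
        [1, 0, y * y - 5],  # f1 = x^2 + (y^2 - 5)
        [y, -2, 0],         # x * f2 = y x^2 - 2 x
        [0, y, -2],         # f2 = y x - 2
    ]


# det S(y) = y^4 - 5 y^2 + 4, so the integer candidates below contain all its roots.
hidden_values = [y for y in range(-3, 4) if det(hidden_variable_matrix(y)) == 0]
roots = [(Fraction(2, y), y) for y in hidden_values]  # x = 2 / y from f2 = 0
print(hidden_values)
print(roots)
```

Each value of the hidden variable found this way lifts to exactly one solution here, which is the generic situation assumed in the text.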

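The relation (3.10) implies that $v(\zeta')$ is a generalized eigenvector of $S^t(x_n)$.
Computing such vectors can be transformed into the following linear
generalized eigenproblem

$$\left( \begin{pmatrix} 0 & I & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & I \\ S_0^t & S_1^t & \cdots & S_{d-1}^t \end{pmatrix} - \zeta_n \begin{pmatrix} I & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & I & 0 \\ 0 & \cdots & 0 & -S_d^t \end{pmatrix} \right) w = 0. \qquad (3.11)$$

The set of eigenvalues of (3.11) contains the values of $\zeta_n$ for which (3.10)
has a solution. The corresponding eigenvectors $w$ are decomposed as $w =
(w_0, \ldots, w_{d-1})$ so that the solution vector $v(\zeta')$ of (3.10) is

$$v(\zeta') = w_0 + \zeta_n w_1 + \cdots + \zeta_n^{d-1} w_{d-1}.$$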


This yields the following algorithm: 


* Notice however that this genericity condition can be relaxed by using duality,
in order to compute the points $\zeta'$ above $\zeta_n$ (when they form a zero-dimensional
fiber) from the eigenspace of $S(\zeta_n)$.
Algorithm 3.5.2 SOLVING BY HIDING A VARIABLE.


INPUT: $f_1, \ldots, f_n \in R$.


1. Construct the resultant matrix $S(x_n)$ of $f_1, \ldots, f_n$ (as polynomials in
$x_1, \ldots, x_{n-1}$, with coefficients in $K[x_n]$) adapted to the geometry of the
problem.

2. Solve the generalized eigenproblem $S(x_n)^t\, v = 0$.

3. Deduce the coordinates of roots $\zeta = (\zeta_1, \ldots, \zeta_n)$ of $f_1 = \cdots = f_n = 0$.

OUTPUT: The roots of $f_1 = \cdots = f_n = 0$.

Here again, we reduce the resolution of $f_1 = 0, \ldots, f_n = 0$ to an eigenvector
problem.
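
Step 2 relies on turning a polynomial matrix into a linear pencil. The sketch below assumes one standard companion layout (identity blocks above the diagonal, transposed coefficient blocks in the last block row) and checks exactly, on a small hand-made example, that the pencil $A - x B$ is singular precisely where $S(x)$ is. The example matrix and all function names are assumptions made for the illustration; an actual solver would pass $A$ and $B$ to a numerical generalized eigenvalue routine.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for k in range(n):
        p = next((i for i in range(k, n) if a[i][k] != 0), None)
        if p is None:
            return Fraction(0)
        if p != k:
            a[k], a[p] = a[p], a[k]
            result = -result
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for c in range(k, n):
                a[i][c] -= factor * a[k][c]
    return result


def evaluate(S, x):
    """S(x) = S[0] + x S[1] + ... + x^d S[d] for a list S of square matrices."""
    k = len(S[0])
    return [[sum(Fraction(x) ** i * S[i][r][c] for i in range(len(S)))
             for c in range(k)] for r in range(k)]


def companion_pencil(S):
    """Matrices (A, B) such that (A - x B) w = 0 linearizes S(x)^t v = 0."""
    d, k = len(S) - 1, len(S[0])
    n = d * k
    A = [[0] * n for _ in range(n)]
    B = [[0] * n for _ in range(n)]
    for b in range(d - 1):
        for r in range(k):
            A[b * k + r][(b + 1) * k + r] = 1  # identity block above the diagonal
            B[b * k + r][b * k + r] = 1        # identity block on the diagonal
    last = (d - 1) * k
    for j in range(d):
        for r in range(k):
            for c in range(k):
                A[last + r][j * k + c] = S[j][c][r]  # transposed block S_j^t
    for r in range(k):
        for c in range(k):
            B[last + r][last + c] = -S[d][c][r]      # block -S_d^t
    return A, B


def pencil_at(A, B, x):
    return [[a - x * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# S(x) = [[x - 1, 1], [0, x^2 + 1]] has determinant (x - 1)(x^2 + 1).
S = [[[-1, 1], [0, 1]], [[1, 0], [0, 0]], [[0, 0], [0, 1]]]
A, B = companion_pencil(S)
print(det(pencil_at(A, B, 1)), det(evaluate(S, 1)))
print(det(pencil_at(A, B, 2)), det(evaluate(S, 2)))
```

With $w = (v, x v, \ldots, x^{d-1} v)$ the first block rows of the pencil hold automatically and the last one reads $S(x)^t v = 0$, which is why the two determinants agree up to sign.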


Example 3.5.3. We illustrate this algorithm on the system

$$\begin{aligned} f_1 &= x_1 x_2 + x_3 - 2 \\ f_2 &= x_1^2 x_3 + 2 x_2 x_3 - 3 \\ f_3 &= x_1 x_2 + x_2^2 + x_2 x_3 - x_1 x_3 - 2. \end{aligned}$$

We hide $x_3$ and use the projective resultant formulation (see Section 2.3 in
Chapter 2). We obtain a $15 \times 15$ matrix $S(x_3)$, and compute its determinant:


> S:=mresultant([f1,f2,f3],[x1,x2]): det(S);


$$\det(S) = x_3^4\, (x_3 - 1)\, (2 x_3^5 - 11 x_3^4 + 20 x_3^3 - 10 x_3^2 + 10 x_3 - 27).$$


The root $x_3 = 0$ does not yield an affine root of the system $f_1 = f_2 = f_3 = 0$
(the corresponding point is at infinity). Substituting $x_3 = 1$ in $S(x_3)$, we get
a matrix of rank 14. The kernel of $S(1)^t$ is generated by

(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1).

This implies that the corresponding root is $(1, 1, 1)$. For the other eigenvalues
(which are the roots of the last factor in $\det(S)$), we proceed similarly in order
to obtain the 5 other (simple) roots of $f_1 = f_2 = f_3 = 0$. Here are numerical
approximations of these roots:

(0.511793 - 1.27671 i, 0.037441 + 1.92488 i, -0.476671 - 0.937337 i),
(0.511793 + 1.27671 i, 0.037441 - 1.92488 i, -0.476671 + 0.937337 i),
(-1.38186 + 0.699017 i, -0.171994 + 0.704698 i, 2.25492 + 1.09402 i),
(-1.38186 - 0.699017 i, -0.171994 - 0.704698 i, 2.25492 - 1.09402 i),
(0.0734678, 0.769107, 1.9435).


3.5.3 Isolated points from resultant matrices 


In this section, we consider $n$ equations $f_1, \ldots, f_n$ in $n$ unknowns, but we do
not necessarily assume that they define a finite affine variety $Z(f_1, \ldots, f_n)$.
We are interested in computing a rational univariate representation of the
isolated points of this variety. We denote by $I_0$ the intersection of the primary
components of $I = (f_1, \ldots, f_n)$ corresponding to isolated points of $Z(I)$ and
$Z_0 = Z(I_0)$. We denote by $C_0(u)$ the Chow form associated to the ideal $I_0$
(see Section 3.2.3).

First we consider that $I = I_0$. Let $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$ be a
generic affine form (the $u_i$ are considered as variables). We choose one of the
previous resultant constructions for $f_0, \ldots, f_n$ which yields a matrix

$$M_0 = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}$$

such that $M_{11}$ is invertible (if it exists). The blocks $M_{00}, M_{10}$ depend only on
the coefficients of $f_0$. From Section 3.5.1 and according to the relation

$$\begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix} \begin{pmatrix} I & 0 \\ -M_{11}^{-1} M_{10} & I \end{pmatrix} = \begin{pmatrix} M_{00} - M_{01} M_{11}^{-1} M_{10} & M_{01} \\ 0 & M_{11} \end{pmatrix},$$

we deduce that $\det(M_0) = \det(M_{f_0}) \det(M_{11})$. This means that $\det(M_0)$ is a scalar
multiple of the Chow form of the ideal $I$. Such a construction applies for a
system which is generic for one of the mentioned resultant formulations. We
can obtain a rational univariate representation of $Z(I)$ applying Algorithm
3.2.10.

If the affine variety $Z(I)$ is not finite, we can still deduce a rational
univariate representation of the isolated points from the previous resultant
construction in (at least) two ways.

When the system is not generic for a given construction, a perturbation
technique can be used. Introducing a new parameter $\varepsilon$ and considering a
perturbed system $f_\varepsilon$ (for instance $f_\varepsilon = f + \varepsilon f_0$), we obtain a resultant matrix
$S_\varepsilon(u)$ whose determinant is of the form

$$\Delta(u, \varepsilon) = \varepsilon^k \Delta_k(u) + \varepsilon^{k+1} \Delta_{k+1}(u) + \cdots \quad \text{with } \Delta_k \neq 0.$$

It can be shown that $\Delta_k(u)$ is a multiple of the Chow form of $I_0$. Applying
Algorithm 3.2.10 to this multiple of the Chow form yields a rational univariate
representation of $Z_0$ (see [Gri86, Chi86, Can90, GH91, LL91] for more details).

The use of a new parameter $\varepsilon$ has a cost that we want to remove. This can
be done by exploiting the properties of the Bezoutian matrix.


Proposition 3.5.4. [EM99a, BEM00] Any nonzero maximal minor $\Delta(u)$ of
the Bezoutian matrix of polynomials $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n, f_1, \ldots, f_n$
is divisible by the Chow form $C_0(u)$ of the isolated points of $I = (f_1, \ldots, f_n)$.


The interesting point here is that we directly get (a multiple of) the Chow form
of the isolated points of $Z(I)$ even if this variety is not finite. In other
words, we do not need to perturb the system for computing a multiple of $C_0(u)$.
Another advantage of this approach is that it yields an "explicit" formulation
for $\Delta(u)$, and its structure can be handled more carefully (for instance, by
working directly on the matrix form instead of dealing with the expansion of
minors). So we have the following algorithm:


Algorithm 3.5.5 RATIONAL UNIVARIATE REPRESENTATION OF THE ISOLATED POINTS.


INPUT: $f_1, \ldots, f_n \in K[x_1, \ldots, x_n]$


1. Compute a nonzero multiple $\Delta(u)$ of the Chow form of $f_1, \ldots, f_n$, from
an adapted resultant formulation of $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n, f_1, \ldots, f_n$
(for instance using the Bezoutian matrix).

2. Get a rational univariate representation of the isolated (and maybe some
embedded) roots of $f_1 = \cdots = f_n = 0$ by applying Algorithm 3.2.10.


In practice, instead of completely expanding the polynomial $d(t + u)$ in
Algorithm 3.2.10, it would be advantageous to consider $u_1, \ldots, u_n$ as
infinitesimal numbers (i.e. $u_i^2 = u_i u_j = 0$ for $i, j = 1, \ldots, n$) in order to get
only the first terms $d_0(u_0) + u_1 d_1(u_0) + \cdots + u_n d_n(u_0)$ of the expansion of
$d(t + u)$. Moreover, we can describe these terms as sums of determinants of
matrices deduced from resultant matrices. This allows us to use fast
interpolation methods to compute $d_0(u_0), \ldots, d_n(u_0)$ efficiently.
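
The effect of treating $u_1, \ldots, u_n$ as infinitesimals can be illustrated on a toy case. The sketch below is only an illustration under stated assumptions: it takes three made-up simple points in $\mathbb{Q}^2$, assumes that the Chow form is the product of the linear forms $u_0 + \zeta_1 u_1 + \zeta_2 u_2$ over these points, and uses a hypothetical separating direction `t`. It multiplies the factors while dropping every product $u_i u_j$, and then recovers each coordinate as $\zeta_i = d_i(u_0)/d_0'(u_0)$ at the roots of $d_0$. Algorithm 3.2.10 itself is not reproduced here.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials in u0 (coefficient lists, lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_eval(p, x):
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

# Hypothetical simple roots and separating direction (assumptions for the demo).
roots = [(Fraction(1), Fraction(2)), (Fraction(-2), Fraction(-1)),
         (Fraction(3), Fraction(0))]
t = (Fraction(1), Fraction(5))

# d[0], d[1], d[2] hold d_0(u0), d_1(u0), d_2(u0); terms in u_i*u_j are dropped.
d = [[Fraction(1)], [Fraction(0)], [Fraction(0)]]
for zeta in roots:
    linear = [t[0] * zeta[0] + t[1] * zeta[1], Fraction(1)]  # u0 + t.zeta
    d = [poly_mul(d[0], linear)] + [
        poly_add(poly_mul(d[i], linear), [zeta[i - 1] * c for c in d[0]])
        for i in (1, 2)
    ]

# Derivative of d_0 with respect to u0.
d0_prime = [k * c for k, c in enumerate(d[0])][1:]

recovered = []
for zeta in roots:
    u0 = -(t[0] * zeta[0] + t[1] * zeta[1])  # a root of d_0
    assert poly_eval(d[0], u0) == 0
    recovered.append(tuple(poly_eval(d[i], u0) / poly_eval(d0_prime, u0)
                           for i in (1, 2)))
print(recovered == roots)
```

In an actual computation the roots of $d_0$ would be found numerically; here they are taken from the known points only to keep the check exact.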


3.5.4 Solving overdetermined systems 


In many problems (such as reconstruction in computer vision, autocalibration
in robotics, identification of sources in signal processing, ...), each
observation yields an equation. We can thus generate as many (approximate)
equations as we want, but usually only one solution is of (physical) interest.
We are therefore dealing with overconstrained systems which have approximate
coefficients (due to measurement errors, for instance).

Here again we are interested in matrix methods which allow us to handle
systems with approximate coefficients. The methods of the previous sections
for the construction of resultant matrices $M_0$ admit natural generalizations
[Laz77] to overconstrained systems, that is, to systems of equations
$f_1 = \cdots = f_m = 0$, with $m > n$, defining a finite number of roots. We consider a
map of the form

$$\mathcal{S} : V_1 \times \cdots \times V_m \to V, \qquad (q_1, \ldots, q_m) \mapsto \sum_{i=1}^{m} f_i q_i,$$

where $V$ and $V_i$ are linear subspaces generated by monomials of $R$. This yields
a rectangular matrix $S$.

A case of special interest is when this matrix is of rank $N - 1$, where $N$ is
the number of rows of $S$. In this case, it can be proved [EM] that
$Z(f_1, \ldots, f_m)$ is reduced to one point $\zeta \in K^n$, and if $x^F = (x^\alpha)_{\alpha \in F}$ is the
set of monomials indexing the rows of $S$, that

$$(\zeta^\alpha)_{\alpha \in F}\, S = 0.$$

Using Cramer's rule, we see that $\zeta^\alpha / \zeta^\beta$ ($\alpha, \beta \in F$, $\zeta^\beta \neq 0$) can be expressed
as the ratio of two maximal minors of $S$. If $1, x_1, \ldots, x_n \in x^F$ (which is the
case most of the time), we obtain $\zeta$ as a rational function of maximal minors
of $S$, and thus of input coefficients of $f_1, \ldots, f_m$.


Algorithm 3.5.6 SOLVING AN OVERCONSTRAINED SYSTEM DEFINING A SINGLE ROOT


INPUT: A system $f_1, \ldots, f_m \in K[x_1, \ldots, x_n]$ (with $m > n$) defining a single
solution.


1. Compute the resultant matrix $S$ for one of the proposed resultant
formulations.
2. Compute the kernel of $S^t$ (that is, the vectors $w$ with $w\,S = 0$) and check
that it is generated by one vector $w = (w_1, w_{x_1}, \ldots, w_{x_n}, \ldots)$.


OUTPUT: $\zeta = \left( \dfrac{w_{x_1}}{w_1}, \ldots, \dfrac{w_{x_n}}{w_1} \right)$

Let us illustrate this algorithm with a projective resultant construction.


Example 3.5.7. We consider the case of 3 conics:


> f1 := x1^2-x1*x2+x2^2-3;

> f2 := x1^2-2*x1*x2+x2^2+x1-x2;

> f3 := x1*x2+x2^2-x1+2*x2-9;

> S := mresultant([f1,f2,f3], [x1,x2]);


     [ -3   0   0   0   0   0   0   0   0   0   0  -9   0   0   0]
     [  0  -3   0   0   0   0  -1   0   0   0  -9   2   0   0   0]
     [  0   0  -3   0   0   0   1   0   0   0   0  -1   0   0  -9]
     [ -1   0   0  -3   0  -1  -2   0   1   0  -1   1  -9   0   2]
     [  0   1  -1   0  -1  -2   0   1   1   0   0   0  -1   2   1]
     [  0  -1   1   0   0   1   0  -1  -2  -1   1   0   2   0   1]
     [  0   0   0   1  -2   0   0   1   0   0   0   0   0   1   0]
S := [  0   0   0  -1   1   0   0  -2   0   0   0   0   1   1   0]
     [  0   0   0   1   0   0   0   1   0   1   0   0   1   0   0]
     [  1   0   0   0   0   1   1   0   0   0   0   0   0  -9  -1]
     [  1   0   0   0   0   0   1   0  -1  -9   2   1   0   0   0]
     [  0   0   1   0   1   1   0   0   0   0   0   0   0  -1   0]
     [  0   1   0   0   0   0   0   0   1   2   1   0   0   0   0]
     [  0   0   0   0   1   0   0   0   0   0   0   0   0   0   0]
     [  0   0   0   0   0   0   0   0   0   1   0   0   0   0   0]


The rows of $S$ are indexed by

$$(1,\ x_2,\ x_1,\ x_1 x_2,\ x_1^2 x_2,\ x_1 x_2^2,\ x_1^3 x_2,\ x_1^2 x_2^2,\ x_1 x_2^3,\ x_1^2,\ x_2^2,\ x_1^3,\ x_2^3,\ x_1^4,\ x_2^4).$$

We compute the kernel of $S^t$ in order to check its rank and to deduce the
common root $\zeta$ of the system:


> kernel(transpose(S));


{(1, 2, 1, 2, 2, 4, 2, 4, 8, 1, 4, 1, 8, 1, 16)}.

Considering the list of monomials which index the rows of $S$, we deduce that
$\zeta = (1, 2)$.
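
The computation of Example 3.5.7 can be replayed without Maple. The following Python sketch is a reconstruction under assumptions: it does not call `mresultant`, but rebuilds the same 15 polynomial multiples $x^\beta f_i$ whose coefficients form the columns of $S$ (the multipliers are read off from the matrix; the ordering of monomials used here is a choice of this sketch and differs from the book's). It then computes the vectors $w$ orthogonal to all these coefficient vectors with exact rational elimination and reads the root from the entries indexed by $1$, $x_1$, $x_2$.

```python
from fractions import Fraction

# Polynomials of Example 3.5.7 as {(degree in x1, degree in x2): coefficient}.
f1 = {(2, 0): 1, (1, 1): -1, (0, 2): 1, (0, 0): -3}
f2 = {(2, 0): 1, (1, 1): -2, (0, 2): 1, (1, 0): 1, (0, 1): -1}
f3 = {(1, 1): 1, (0, 2): 1, (1, 0): -1, (0, 1): 2, (0, 0): -9}

# The 15 monomials of degree <= 4 (ordering chosen for this sketch).
monomials = [(a, deg - a) for deg in range(5) for a in range(deg + 1)]

# Monomial multipliers of each f_i, as exponent pairs.
multipliers = [
    (f1, [(0, 0), (0, 1), (1, 0), (1, 1)]),
    (f2, [(2, 0), (1, 0), (0, 0), (1, 1), (0, 1)]),
    (f3, [(0, 2), (0, 1), (0, 0), (1, 1), (2, 0), (1, 0)]),
]

def coefficient_row(poly, shift):
    """Coefficient vector of x^shift * poly in the chosen monomial order."""
    row = [Fraction(0)] * len(monomials)
    for (a, b), c in poly.items():
        row[monomials.index((a + shift[0], b + shift[1]))] += c
    return row

# Each row is one column of S, so this matrix plays the role of transpose(S).
rows = [coefficient_row(f, s) for f, shifts in multipliers for s in shifts]

def nullspace(matrix):
    """Basis of {w : matrix * w = 0}, via reduced row echelon form."""
    m = [r[:] for r in matrix]
    ncols, pivots, r = len(m[0]), [], 0
    for c in range(ncols):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -m[i][free]
        basis.append(v)
    return basis

kernel = nullspace(rows)
w = kernel[0]
w_one = w[monomials.index((0, 0))]
root = (w[monomials.index((1, 0))] / w_one, w[monomials.index((0, 1))] / w_one)
print(len(kernel), root)
```

Dividing by the entry indexed by the monomial $1$ is the normalization of the OUTPUT step of Algorithm 3.5.6.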


In case the overdetermined system has more than one root, we can follow the
same approach. We choose a subset $E$ of $F$ (if possible containing the monomials
$1, x_1, \ldots, x_n$) such that the rank of the matrix indexed by the monomials
$x^{F \setminus E}$ is the rank $r = N - D$ of $S$. The set $x^E$ will be the basis of $A$. Assuming
that the monomials $x_i x^E$, $i = 1, \ldots, n$, are also in $x^F$, we complete the matrix
$S$ with the block of coefficients of $f_0 x^E$, where
$f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$. By a Schur complement computation, we deduce
the matrix of multiplication by $f_0$ in the basis $x^E$ of $A$. Now, by applying the
algorithms of Section 3.2.2, we deduce the roots of the overdetermined system
$f_1, \ldots, f_m$ (see [EM99b] for more details on this approach).


3.6 Applications 


We will use the tools and methods developed above to solve some problems 
coming from several areas of applications. 


3.6.1 Implicitization of a rational surface 
A rational surface $(S)$ in $K^3$ may be represented by a parametric
representation:

$$(S): \quad x = \frac{f(s, t)}{d_1(s, t)}, \qquad y = \frac{g(s, t)}{d_2(s, t)}, \qquad z = \frac{h(s, t)}{d_3(s, t)},$$

where $f, g, h, d_1, d_2, d_3 \in K[s, t]$, or by an implicit equation (i.e. $F \in K[x, y, z]$
of minimal degree satisfying $F(a, b, c) = 0$ for all $(a, b, c) \in (S)$). These two
representations are important for different reasons. For instance, the first one
is useful for drawing $(S)$ and the second one to intersect surfaces or to decide
whether a point is in $(S)$ or not.

We will investigate the implicitization problem, that is the problem of 
converting a parametric representation of a rational surface into an implicit 
one. 

These last decades have witnessed a renewal of this problem motivated by
applications in computer-aided geometric design and geometric modelling
([SAG84], [Buc88a], [Hof89], [Kal91], [CM92], [AGR95], [CGZ00], [AS01],
[CGKW01]). Its solution is given by resultants, Gröbner bases, moving
surfaces (see [SC95], [BCD03], [D'A01]). The techniques based on resultants and
moving surfaces fail in the presence of base points (i.e. common roots of
$f, g, h, d_1, d_2, d_3$). The Gröbner bases methods are fairly expensive in practice
even if the dimension is small. Recently, methods using residual resultants and
approximation complexes have been proposed but only under some restrictive
geometric hypotheses on the zero-locus of base points which are difficult to
verify ([Bus01b], [BJ03], [BC]). We propose an approach based on the residue
calculus extending [GV97]. This method works in the presence of base points
and no geometric hypotheses on the zero-locus of base points are needed.


In order to find an implicit equation of $(S)$, as in Proposition 3.3.6 we can
compute a nonzero maximal minor of the Bezoutian matrix of polynomials
$x d_1 - f$, $y d_2 - g$, $z d_3 - h$ with respect to $s, t$. In general, this yields a multiple
of the implicit equation as shown below.


Example 3.6.1. Let $(S)$ be the surface parameterized by

$$x = s, \qquad y = \frac{t^2 s + 2t + s}{t^2}, \qquad z = \frac{t^2 - 2st - 1}{t^2}.$$

The Bezoutian matrix of $x - s$, $y t^2 - t^2 s - 2t - s$, $z t^2 - t^2 + 2ts + 1$ in
$(K[x, y, z])[s, t]$ is a $4 \times 4$ matrix.


> melim([x*d1-f, y*d2-g, z*d3-h], [s,t]);


$$(z - 1)^2 (4x^4 - 4x^3 y + x^2 z^2 - 8x^2 z + 2xyz + 4x^2 + y^2 + 4z - 4).$$


The second factor in this expression is the expected implicit equation.
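
A quick sanity check of Example 3.6.1 is to substitute the parameterization into the second factor. The Python sketch below does this at a grid of rational sample parameters (the grid is an arbitrary choice of this sketch) with exact arithmetic; it confirms that the factor vanishes on the sampled points, which is a necessary condition for it to be the implicit equation, not a proof of minimality.

```python
from fractions import Fraction
from itertools import product

def surface_point(s, t):
    """Point of (S) for parameters (s, t), with t != 0."""
    x = s
    y = (t * t * s + 2 * t + s) / (t * t)
    z = (t * t - 2 * s * t - 1) / (t * t)
    return x, y, z

def implicit(x, y, z):
    """Second factor of the Bezoutian minor in Example 3.6.1."""
    return (4 * x**4 - 4 * x**3 * y + x**2 * z**2 - 8 * x**2 * z
            + 2 * x * y * z + 4 * x**2 + y**2 + 4 * z - 4)

# Sample parameters: s in -3..3 and nonzero t in -3..3.
samples = [(Fraction(s), Fraction(t))
           for s, t in product(range(-3, 4), [-3, -2, -1, 1, 2, 3])]
values = [implicit(*surface_point(s, t)) for s, t in samples]
print(all(v == 0 for v in values))
```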

The use of the Bezoutian matrix produces an extraneous term along with 
the implicit equation. We will see how to use the residue calculus in order to 
remove it from this equation. 


Let us consider the polynomials in $(K[x, y, z])[s, t]$

$$\begin{aligned}
F(s, t) &= x\, d_1(s, t) - f(s, t), \\
G(s, t) &= y\, d_2(s, t) - g(s, t), \\
H(s, t) &= z\, d_3(s, t) - h(s, t).
\end{aligned}$$

Let $Z_0 = \{\zeta \in \overline{K(y, z)}^2 : G(\zeta) = H(\zeta) = 0\} = Z_1 \cup Z_2$, where $Z_1$ is the
algebraic variety
$Z_0 \cap Z(d_1 d_2 d_3) = \{\zeta \in \overline{K(y, z)}^2 : G(\zeta) = H(\zeta) = d_1 d_2 d_3(\zeta) = 0\}$
and $Z_2 = Z_0 \setminus Z_1$. If $Z_2$ is finite, let $Q(x, y, z)$ be the following nonzero element

$$Q(x, y, z) = \prod_{\zeta \in Z_2} F(\zeta) = \Bigl( \prod_{\zeta \in Z_2} d_1(\zeta) \Bigr) \bigl( x^m + \sigma_1(y, z)\, x^{m-1} + \cdots + \sigma_m(y, z) \bigr),$$

where $m$ is the number of points (counting their multiplicities) in $Z_2$ and
$\sigma_i(y, z)$ is the $i$-th elementary symmetric function of $\{ f(\zeta)/d_1(\zeta) : \zeta \in Z_2 \}$.

Theorem 3.6.2. The implicit equation of the surface $(S)$ is the square-free
part of the numerator of

$$E(x, y, z) = x^m + \sigma_1(y, z)\, x^{m-1} + \cdots + \sigma_m(y, z) \in K(y, z)[x].$$


Proof. Let us choose a point $(y_0, z_0)$ in the open subset $U$ of $\overline{K}^2$ such that the
specialization $\tilde{Z}_0$ of $Z_0$ is finite in $\overline{K}^2$ and the denominators of
$\sigma_1, \ldots, \sigma_m$ do not vanish. Then we have $Q(x_0, y_0, z_0) = 0$ if and only if

$$x_0^m + \sigma_1(y_0, z_0)\, x_0^{m-1} + \cdots + \sigma_m(y_0, z_0) = 0,$$

which is equivalent to the existence of an element $\zeta_0 \in \tilde{Z}_2$ such that
$x_0 = f(\zeta_0)/d_1(\zeta_0)$. In other words, the numerator of $E(x, y, z)$ vanishes at a
point $(x_0, y_0, z_0)$ with $(y_0, z_0) \in U$ if and only if this point belongs to $(S)$,
which implies that the square-free part of the numerator of $E(x, y, z)$ is, up to
a scalar, the implicit equation of the surface $(S)$.


The coefficients $\sigma_i(y, z)$ in Theorem 3.6.2 can be computed using the Newton
identities (3.8). So we need to compute the Newton sums

$$S_j(y, z) = \sum_{\zeta \in Z_2} \left( \frac{f(\zeta)}{d_1(\zeta)} \right)^j, \qquad j = 0, \ldots, m.$$

By adding a variable we can assume that $d_1 = 1$.


Algorithm 3.6.3 IMPLICITIZATION OF A RATIONAL SURFACE

INPUT: Polynomials $f, g, h, d_1, d_2, d_3$ in $K[s, t]$.


1. Compute an algebraic relation $A_s(u_0, u_1, u_2)$ (resp. $A_t(u_0, u_1, u_2)$) between
$s$, $G = y d_2 - g$, $H = z d_3 - h$ (resp. $t, G, H$) in $K[y, z][s, t]$.
  • If the univariate polynomials $R_s = A_s(s, 0, 0)$, $R_t = A_t(t, 0, 0)$ do not
    vanish identically (which is often the case), let $M$ be the $2 \times 2$ matrix
    such that $\begin{pmatrix} R_s \\ R_t \end{pmatrix} = M \begin{pmatrix} G \\ H \end{pmatrix}$.
    - Compute the degree

      $$m = \tau_{(G,H)}\bigl(\mathrm{Jac}(G, H)\bigr) = \tau_{(R_s,R_t)}\bigl(\mathrm{Jac}(G, H) \det(M)\bigr)$$

      in $x$ of the polynomial $E(x, y, z) \in K(y, z)[x]$ in Theorem 3.6.2.
    - For $i$ from 1 to $m$, compute

      $$S_i(y, z) = \tau_{(G,H)}\bigl(\mathrm{Jac}(G, H)\, f^i\bigr) = \tau_{(R_s,R_t)}\bigl(\mathrm{Jac}(G, H) \det(M)\, f^i\bigr).$$

  • If the polynomial $R_s R_t \equiv 0$, the power sums $S_i(y, z)$, for $i = 0, \ldots, m$,
    are computed using the algebraic relations $A_s(u_0, u_1, u_2)$, $A_t(u_0, u_1, u_2)$
    and the formula in Proposition 3.3.8.
2. Use the Newton identities (3.8) to obtain the elementary symmetric
functions $\sigma_i(y, z)$ from the Newton sums $S_i(y, z)$, $i = 1, \ldots, m$.


OUTPUT: The numerator of $x^m + \sigma_1(y, z)\, x^{m-1} + \cdots + \sigma_m(y, z) \in K(y, z)[x]$.

Example 3.6.1 (continued). In this case, the univariate polynomials $R_s$ and
$R_t$ are equal to

$$\begin{aligned}
R_s = {} & -4 + 4z^3 + 4s^4 + 4s^2 + 21 s^2 z^2 - 16 s^2 z - 4 s^3 y - 12 z^2 - 10 z^3 s^2 \\
& + 4 z^2 s^4 - 8 z s^4 + z^4 s^2 + 2 y z^3 s + 8 y s^3 z - 4 y s^3 z^2 - 4 z^2 y s \\
& + 2 z s y + y^2 - 2 y^2 z + z^2 y^2 + 12 z,
\end{aligned}$$

$$R_t = 4z - 4 - 8 t^3 y + 8 t^3 y z + 16 t^2 - 20 t^2 z + 4 z^2 t^2 + 4 t^4 z^2 - 8 t^4 z + 4 t^4.$$


The computation of the Newton sums gives

$$S_0 = 4, \quad S_1 = y, \quad S_2 = -\tfrac{1}{2} z^2 + 4z + y^2 - 2, \quad S_3 = \tfrac{1}{4}\, y\, (-3z^2 + 18z - 12 + 4y^2),$$

$$S_4 = \tfrac{1}{8} z^4 - 2z^3 - z^2 y^2 + 9z^2 - 12z + 6 y^2 z + y^4 - 5y^2 + 6.$$


And the implicit equation of $(S)$ is

$$x^4 - x^3 y + \tfrac{1}{4} x^2 z^2 - 2 x^2 z + x^2 + \tfrac{1}{2} x y z + \tfrac{1}{4} y^2 + z - 1.$$
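
Step 2 of Algorithm 3.6.3 can be illustrated on this example. The Python sketch below assumes the sign convention in which $\sigma_k$ is the coefficient of $x^{m-k}$ in the monic polynomial $E$, so that the Newton identities read $S_k + \sigma_1 S_{k-1} + \cdots + \sigma_{k-1} S_1 + k\,\sigma_k = 0$ (the exact form of (3.8) is not reproduced here). It evaluates the Newton sums at rational sample values of $(y, z)$, chosen arbitrarily, and compares the resulting coefficients with those of the implicit equation divided by its leading coefficient.

```python
from fractions import Fraction

def power_sums(y, z):
    """Newton sums S_1..S_4 of the example, evaluated at a point (y, z)."""
    S1 = y
    S2 = -z**2 / 2 + 4 * z + y**2 - 2
    S3 = y * (-3 * z**2 + 18 * z - 12 + 4 * y**2) / 4
    S4 = (z**4 / 8 - 2 * z**3 - z**2 * y**2 + 9 * z**2 - 12 * z
          + 6 * y**2 * z + y**4 - 5 * y**2 + 6)
    return [S1, S2, S3, S4]

def newton_coefficients(S):
    """Coefficients sigma_1..sigma_m of x^m + sigma_1 x^(m-1) + ... + sigma_m."""
    sigma = []
    for k in range(1, len(S) + 1):
        acc = S[k - 1] + sum(sigma[j - 1] * S[k - j - 1] for j in range(1, k))
        sigma.append(-acc / k)
    return sigma

def expected_coefficients(y, z):
    """Coefficients of x^3, x^2, x, 1 in the monic implicit equation."""
    return [-y, z**2 / 4 - 2 * z + 1, y * z / 2, y**2 / 4 + z - 1]

points = [(Fraction(0), Fraction(0)), (Fraction(2), Fraction(1)),
          (Fraction(-3, 2), Fraction(5, 7))]
matches = [newton_coefficients(power_sums(y, z)) == expected_coefficients(y, z)
           for y, z in points]
print(matches)
```

Working at sample points keeps the sketch free of any symbolic algebra dependency; a symbolic check would follow the same recursion.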


3.6.2 The position of a camera 


We consider a camera which is observing a scene. In this scene, three points
$A, B, C$ are identified. The center of the camera is denoted by $X$. We assume
that the camera is calibrated, that is, we know the focal distance, the
projection of the center of the camera, etc. We then easily deduce the angles
between the rays $XA$, $XB$, $XC$ from the images of the points $A, B, C$.


We denote by $\alpha$ the angle between $XB$ and $XC$, $\beta$ the angle between $XA$ and
$XC$, $\gamma$ the angle between $XA$ and $XB$. These angles are deduced from the
measurements in the image. We also assume that the distances $a$ between $B$ and
$C$, $b$ between $A$ and $C$, $c$ between $A$ and $B$ are known. This leads to the following
system of polynomial constraints:

$$\begin{aligned}
x_1^2 + x_2^2 - 2\cos(\gamma)\, x_1 x_2 - c^2 &= 0 \\
x_1^2 + x_3^2 - 2\cos(\beta)\, x_1 x_3 - b^2 &= 0 \\
x_2^2 + x_3^2 - 2\cos(\alpha)\, x_2 x_3 - a^2 &= 0
\end{aligned} \qquad (3.12)$$


where $x_1 = |XA|$, $x_2 = |XB|$, $x_3 = |XC|$. Once we know the distances
$x_1, x_2, x_3$, the two symmetric positions of the center $X$ are easily deduced. The
system (3.12) can be solved by direct polynomial manipulations, expressing
$x_2$ and $x_3$ in terms of $x_1$ from the first two equations and substituting in the
last one. After removing the square roots, we obtain a polynomial of degree
8 in $x_1$, which implies at most 16 positions of the center $X$ in this problem.
Another simple way to get this equation is to eliminate the variables $x_2, x_3$,
using the Bezoutian construction (from the MULTIRES package), and we obtain


> melim([f1,f2,f3], [x2,x3]);


$$\begin{aligned}
2\cos(\alpha)\, \bigl( & 64\cos(\beta)^2\cos(\alpha)^2\cos(\gamma)^2 - 64\cos(\beta)^3\cos(\alpha)\cos(\gamma) - 64\cos(\beta)\cos(\alpha)^3\cos(\gamma) + 16\cos(\gamma)^4 \\
& - 64\cos(\beta)\cos(\alpha)\cos(\gamma)^3 + 16\cos(\beta)^4 + 32\cos(\beta)^2\cos(\alpha)^2 + 32\cos(\beta)^2\cos(\gamma)^2 + 16\cos(\alpha)^4 \\
& + 32\cos(\alpha)^2\cos(\gamma)^2 + 64\cos(\beta)\cos(\alpha)\cos(\gamma) - 32\cos(\beta)^2 - 32\cos(\alpha)^2 - 32\cos(\gamma)^2 + 16 \bigr)\, x_1^8 + \cdots
\end{aligned}$$


Once this equation of degree 8 in x1 is known, the numerical solving is easy. 
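
The constraints (3.12) are the law of cosines in the triangles $XAB$, $XAC$, $XBC$. The Python sketch below checks this on a synthetic configuration: the coordinates of $X, A, B, C$ are made-up values (assumptions for illustration only), from which the distances and the cosines of the angles are computed before evaluating the three residuals. It does not perform the elimination itself.

```python
import math

def cos_angle(apex, p, q):
    """Cosine of the angle at `apex` between the rays apex->p and apex->q."""
    u = [a - b for a, b in zip(p, apex)]
    v = [a - b for a, b in zip(q, apex)]
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Hypothetical camera center and scene points.
X = (0.3, -0.2, 4.0)
A = (1.0, 0.0, 0.0)
B = (-0.5, 1.2, 0.3)
C = (0.2, -1.0, -0.4)

# Unknowns of (3.12) and known data of the problem.
x1, x2, x3 = math.dist(X, A), math.dist(X, B), math.dist(X, C)
a, b, c = math.dist(B, C), math.dist(A, C), math.dist(A, B)
cos_alpha = cos_angle(X, B, C)
cos_beta = cos_angle(X, A, C)
cos_gamma = cos_angle(X, A, B)

residuals = [
    x1**2 + x2**2 - 2 * cos_gamma * x1 * x2 - c**2,
    x1**2 + x3**2 - 2 * cos_beta * x1 * x3 - b**2,
    x2**2 + x3**2 - 2 * cos_alpha * x2 * x3 - a**2,
]
print(max(abs(r) for r in residuals))
```

The residuals are zero up to floating-point rounding, which is why the check uses a tolerance rather than exact equality.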


3.6.3 Autocalibration of a camera 


We consider here the problem of computing the intrinsic parameters of a
camera from observations and measurements in 3 images of the same scene.
Following the approach described in [Fau93], the camera is modeled by a
pinhole projection. From the 3 images, we suppose that we are able to compute
the fundamental matrices relating a pair of points in correspondence in two
images. If $m, m'$ are the images of a point $M \in \mathbb{R}^3$ in two photos, we have
$m^t F m' = 0$, where $F$ is the fundamental matrix.

From 3 images and the 3 corresponding fundamental matrices, we deduce the
so-called Kruppa equations on the 6 intrinsic parameters of the camera. See
[Kru13], [Fau93] for more details. This is a system of 6 quadratic homogeneous
equations in 6 variables. We solve this overdetermined system by choosing 5
equations among the six, solving the corresponding affine system and choosing
the best solutions for the last equation among the 32 solutions. This took 0.38s
on an Alpha 500 MHz workstation for the following experimentation:


Exact root             Computed root

1.049401330318981      1.049378730793354
4.884653820635368      4.884757558650871
6.011985256613766      6.011985146332036
0.1726009605860270     0.1725610425715577
1.727887086410446      1.727898150468536


The solver used for this computation has been developed by Ph. Trébuchet
[Tré02] and is available in the library SYNAPS [DRMRT02] (see
Solve(L, newmac<C>())).


3.6.4 Cylinders through 4 and 5 points 


We consider the problem of finding cylinders through 4 or 5 points. The system 
that we use is described in [DMPT03]. 


The numbers of solutions for the problems that we consider are the following:


Cylinders through 5 points: 6 = 3 × 3 - 3 solutions.

Cylinders through 4 points and fixed radius: 12 = 3 × 4 solutions.

Lines tangent to 4 unit balls: 12 solutions.

Cylinders through 4 points and extremal radius: 18 = 3 × 10 - 3 × 4
solutions.

Here are experimental results, also obtained with the solver developed by
Ph. Trébuchet:


Problem                                        time     max(|f_i|)

Cylinders through 5 points                     0.03s    5 · 10^-9
Parallel cylinders through 2x4 points          0.03s    5- 107°
Cylinders through 4 points, extremal radius    2.9s     107°


The computation was performed on an Intel PII 400 with 128 MB of RAM.
max(|f_i|) is the maximum of the norm of the defining polynomials $f_i$
evaluated at the approximated roots. The relatively large time spent in the last
problem is due to the treatment of multiple roots.


3.6.5 Position of a parallel robot 


Consider a parallel robot, which is a platform controlled by 6 arms: 


From the measurements of the lengths of the arms, we would like to know the
position of the platform. This problem is a classical benchmark in polynomial
system solving. We know from [RV95, Laz93, Mou93] that this problem has
at most 40 solutions and that this bound is reached [Die98]. Here is the curve
of degree 40 that we obtain when we remove an arm of the mechanism:

The geometric constraints describing the position of the platform are
transformed into a system of 6 polynomial equations:


$$\| R\, Y_i + T - X_i \|^2 - d_i^2 = 0, \qquad i = 1, \ldots, 6,$$


where $R$ equals

$$\frac{1}{a^2 + b^2 + c^2 + d^2}
\begin{pmatrix}
a^2 - b^2 - c^2 + d^2 & 2ab - 2cd & 2ac + 2bd \\
2ab + 2cd & -a^2 + b^2 - c^2 + d^2 & 2bc - 2ad \\
2ac - 2bd & 2ad + 2bc & -a^2 - b^2 + c^2 + d^2
\end{pmatrix}$$


i.e. the rotation of the platform with respect to a reference frame, and
$T = (u, v, w)$ is its translation. We again use the solver by Ph. Trébuchet with
three modelisations: a direct one with point coordinates (first column), one
with quaternions (second column), and one deduced from the residual
resultant construction (column "Redundant") as described in [Bus01a]. With
different numerical precisions, we obtain the following results:


Precision    Direct modelisation    Quaternions    Redundant

250 b.       3.21s                  8.46s          1.5s
128 b.       -                      6.25s          1.2s


Here n b. denotes the number n of bits used in the computation. 
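
The matrix $R$ given above is the usual rotation matrix attached to a nonzero quaternion. As a sanity check, the following Python sketch builds $R$ for sample integer values of $(a, b, c, d)$ (arbitrary choices of this sketch) and verifies with exact arithmetic that $R R^t = I$ and $\det R = 1$.

```python
from fractions import Fraction

def rotation(a, b, c, d):
    """Matrix R of Section 3.6.5 for parameters (a, b, c, d), not all zero."""
    n = Fraction(a * a + b * b + c * c + d * d)
    rows = [
        [a * a - b * b - c * c + d * d, 2 * a * b - 2 * c * d, 2 * a * c + 2 * b * d],
        [2 * a * b + 2 * c * d, -a * a + b * b - c * c + d * d, 2 * b * c - 2 * a * d],
        [2 * a * c - 2 * b * d, 2 * a * d + 2 * b * c, -a * a - b * b + c * c + d * d],
    ]
    return [[Fraction(x) / n for x in row] for row in rows]

def times_transpose(m):
    """Product m * m^t."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in m] for r1 in m]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
R = rotation(1, 2, 3, 4)
is_orthogonal = times_transpose(R) == identity
determinant = det3(R)
print(is_orthogonal, determinant)
```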


3.6.6 Direct kinematic problem of a special parallel robot 


Resultant constructions can also be used for some special geometry of the
platform. Here is an example where two attached points of the arms on the
platform are identical. We solve this problem by using the Bezoutian
formulation, which yields a 20 × 20 matrix of polynomials in one variable. The
number of complex solutions is also 40. The code for the construction of the
matrix is generated in a pre-processing step and the parameters defining the
geometry of the platform are instantiated at run time. This yields the following
results. There are 6 real solutions, one being of multiplicity 2:

We obtain the following error $\bigl|\, \|R\, Y_i + T - X_i\|^2 - d_i^2 \,\bigr|$ < 10~® and the time
for solving is 0.5s on an Intel PII 400 with 128 MB of RAM.


3.6.7 Molecular conformation 


Similar resultant constructions can also be used, in order to compute the
possible conformations of a molecule when the position and orientation of
the links at the extremity are known. The approach is similar to the one
described in [RR95]. It was developed by O. Ruatta, based on the SYNAPS
library. Here also, the resultant matrix is constructed in a preprocessing step
and we instantiate the parameters describing the geometry of the molecule at
run-time. In this example, we obtain 6 real solutions among the 16 complex
possible roots:

The numeric error on the solutions is bounded by 10~° and the time for solving
is 0.090s, on a standard workstation.


3.6.8 Blind identification in signal processing 


Finally, we consider a problem from signal processing described in detail in
[GCMT02]. It is related to the transmission of an input signal $x(n)$ of size $p$
depending on the discrete time $n$ into a convolution channel of length $L$. The
output is $y(n)$ and we want to compute the impulse response matrix $H(n)$
satisfying:

$$y(n) = \sum_{m=0}^{L-1} H(m)\, x(n - m) + b(n),$$


where $b(n)$ is the noise. If $b(n)$ is centered Gaussian, a statistical analysis of
the output signal yields the equations:

$$\sum_{m=0}^{L-1} \sum_{i=1}^{p} h_{\alpha,i}(m)\, h_{\beta,i}(m)\, (-1)^{n-m} = E(y_\alpha(n)\, y_\beta(n - 1)),$$

where $h_{\alpha,i}(m)$ are the unknowns and the $E(y_\alpha(n)\, y_\beta(n - 1))$ are known from
the output signal measurements. We solve this system of polynomial equations
of degree 2 in 6 variables, which has 64 solutions for $p = 1$, with the algebraic
solver of Ph. Trébuchet and we obtain the following results:

A real root


x0 = -1.803468527372455
x1 = -5.162835380624794
x2 = -7.568759900599482
x3 = -6.893354578266418
x4 = -3.998807562745594
x5 = -1.164422870375179


Error = 10-®, Time = 0.76s 
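
The convolution model at the start of this section can be made concrete in the simplest scalar case. The Python sketch below is a simplification under assumptions: it takes $p = 1$ and a single output, so that $H(m)$ reduces to a number `h[m]`, it assumes $x(k) = 0$ for $k < 0$, and the channel and input values are made up. It only evaluates $y(n) = \sum_{m=0}^{L-1} h(m)\, x(n - m) + b(n)$, not the identification problem solved in the text.

```python
def channel_output(h, x, noise=None):
    """Output of a scalar convolution channel of length L = len(h)."""
    L = len(h)
    y = []
    for n in range(len(x)):
        # Sum over the taps, skipping samples before the start of the signal.
        acc = sum(h[m] * x[n - m] for m in range(L) if n - m >= 0)
        if noise is not None:
            acc += noise[n]
        y.append(acc)
    return y

h = [1, 2, 3]        # hypothetical impulse response, L = 3
x = [1, 0, 0, 1]     # hypothetical input signal
y = channel_output(h, x)
print(y)
```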


4


An algebraist’s view on border bases 


Achim Kehrein¹, Martin Kreuzer¹, and Lorenzo Robbiano²


¹ Universität Dortmund, Fachbereich Mathematik, 44221 Dortmund, Germany,
achim.kehrein@mathematik.uni-dortmund.de
martin.kreuzer@mathematik.uni-dortmund.de

² Dipartimento di Matematica, Via Dodecaneso 35, 16146 Genova, Italy,
robbiano@dima.unige.it


Summary. This chapter is devoted to laying the algebraic foundations for border 
bases of ideals. Using an order ideal O, we describe a zero-dimensional ideal from the 
outside. The first and higher borders of O can be used to measure the distance of a 
term from O and to define O-border bases. We study their existence and uniqueness, 
their relation to Gröbner bases, and their characterization in terms of commuting
matrices. Finally, we use border bases to solve a problem coming from statistics. 


4.0 Introduction 


El infinito tango me lleva hacia todo 
[The infinite tango takes me towards everything] 
(Jorge Luis Borges) 


The third author was invited to teach a course at the CIMPA school in 
July 2003. When the time came to write a contribution to the present volume, 
he was still inspired by the tunes of classical tango songs which had been 
floating in his mind since his stay in Buenos Aires. He had the idea to create 
some variations on one of the themes of his lectures. Together with the first 
and second authors, he formed a trio of algebraists. They started to collect 
scattered phrases and tunes connected to the main theme, and to rework them 
into a survey on border bases. Since the idea was welcomed by the organizers, 
you have now the opportunity to enjoy their composition. 


In the last few years it has become increasingly evident how Gröbner
bases are changing the mathematical landscape. To use a lively metaphor, we
can say that by considering a Gröbner basis of an ideal $I$ in the polynomial
ring $P = K[x_1, \ldots, x_n]$, we are looking at $I$ from the inside, i.e. by describing
a special set of generators. But a Gröbner basis grants us another perspective.
We can look at $I$ from the outside, i.e. by describing a set of polynomials which
forms a $K$-vector space basis of $P/I$, namely the set of terms outside $\mathrm{LT}_\sigma(I)$
for some term ordering $\sigma$. However, Gröbner bases are not optimal from the
latter point of view, for instance, because the bases they provide tend to be
numerically ill-behaved.

This leads us to one of the main ideas behind the concept of a border
basis. We want to find more "general" systems of generators of $I$ which give
rise to a $K$-basis of $P/I$. Quotation marks are in order here, since so far the
generalization only works for the subclass of zero-dimensional ideals $I$. In the
zero-dimensional case, the theory of border bases is indeed an extension of
the theory of Gröbner bases, because there are border bases which cannot be
associated to Gröbner bases. Moreover, border bases do not require the choice
of a term ordering. Our hope is that the greater freedom they provide will
make it possible to construct bases of $P/I$ having additional good properties
such as numerical stability or symmetry.

Even if these considerations convince you that studying border bases is 
useful, you might still ask why we want to add this survey to the current 
literature on that topic? Our main reason is that we believe that the alge- 
braic foundations of border bases have not yet been laid out solidly enough. 
Important contributions are scattered across many publications (some in less 
widely distributed journals), and do not enjoy a unified terminology or a co- 
herent set of hypotheses. We hope that this chapter can be used as a first 
solid foundation of a theory which will surely expand quickly. 


Now let us look at the content more closely. In Section 4.1 we describe
some techniques for treating pairwise commuting endomorphisms of finitely
generated vector spaces. In particular, we describe a Buchberger-Möller type
algorithm (see Theorem 4.1.7) for computing the defining ideal of a finite set
of commuting matrices. Given pairwise commuting endomorphisms
$\varphi_1, \ldots, \varphi_n$ of a finite dimensional $K$-vector space $V$, we can view $V$ as a
$P$-module via $f \cdot v = f(\varphi_1, \ldots, \varphi_n)(v)$ for $f \in P$ and $v \in V$. Then Theorem 4.1.9
yields an algorithm for checking whether $V$ is a cyclic $P$-module, i.e. whether it
is isomorphic to $P/I$ for some zero-dimensional ideal $I \subset P$.

Section 4.2 is a technical interlude where order ideals, borders, indices, 
and marked polynomials have their solos. An order ideal is a finite set of 
terms which is closed under taking divisors. We use order ideals to describe 
a zero-dimensional ideal “from the outside”. The first and higher borders of 
an order ideal can be used to measure the “distance” of a term from the 
order ideal. The main tune in Section 4.2 is played by the Border Division 
Algorithm 4.2.10. It imitates the division algorithm in Gröbner basis theory
and allows us to divide a polynomial by a border prebasis, i.e. by a list of 
polynomials which are “marked” by the terms in the border of an order ideal. 

And then, as true stars, border bases appear late in the show. They enter 
the stage in Section 4.3 and solve the task of finding a system of generators of 
a zero-dimensional polynomial ideal having good properties. After we discuss 
the existence and uniqueness of border bases (see Theorem 4.3.4), we study 
their relation to Gröbner bases (see for instance Propositions 4.3.6 and 4.3.9).
Then we define normal forms with respect to an order ideal, and use border
bases to compute them. Many useful properties of normal forms are collected
in Proposition 4.3.13.

In the final part of Section 4.3, we explain the connection between border 
bases and commuting matrices. This variation leads to the fundamental The- 
orem 4.3.17 which characterizes border bases in terms of commuting matrices 
and opens the door for our main application. Namely, we use border bases 
to solve a problem coming from statistics. This application is presented in 
Section 4.4, where we discuss the statistical background and explain the role 
of border bases in this field. 

Throughout the text, we have tried to provide a generous number of ex- 
amples. They are intended to help the reader master the basics of the theory 
of border bases. Moreover, we have tried to keep this survey as self-contained 
and elementary as possible. When we had to quote “standard results” of 
computer algebra, we preferred to rely on the book by the second and third 
authors [KR00]. This does not mean that those results are not contained in 
other books on the subject; we were merely more familiar with it. 


Albert Einstein is said to have remarked that the secret of creativity was
to know how to hide one's sources. Since none of us is Albert Einstein, we
try to mention all sources of this survey. We apologize if we are unaware of
some important contribution to the topic. First and foremost, we would like to
acknowledge the work of Hans J. Stetter (see [AS88], [AS89], and [Ste04]) who
used border bases in connection with problems arising in numerical analysis.
Later H. Michael Möller recognized the usefulness of these results for computer
algebra (see [Möl93], [MS95], and [MT01]). These pioneering works triggered a
flurry of further activities in the area, most notably by Bernard Mourrain (see
for instance [Mou99]) from the algorithmic point of view. A good portion of
the material presented here is taken from the papers [CR97], [CR01], [KK03a],
[Rob98], [Robb], and [RR98]. Moreover, many results we discuss are closely
related to other surveys in this volume.

Naturally, much work still has to be done; or, as we like to put it, there is 
still a huge TODO-list. A path which deserves further attention is the connec- 
tion between border bases and numerical computation. Many ideas about the 
interplay of numerical and symbolic computation were proposed by Stetter, 
but we believe that there remains a large gap between the two areas which 
has to be addressed by algebraists. What about the algorithmic aspects? Al- 
most no computer algebra system has built-in facilities for computing border 
bases. Naive algorithms for computing border bases, e.g. algorithms based
on Gröbner basis computations, require substantial improvements in order to
be practically feasible. This is an area of ongoing research. Some results in
this direction are contained in Chapter 3. On the theoretical side we can ask
whether the analogy between border bases and Gröbner bases can be further
extended. First results in this direction are contained in [KK03a], but there
appears to be ample scope for extending the algebraic theory of border bases.
Finally, wouldn't it be wonderful to remove the hypothesis that $I$ is
zero-dimensional, i.e. to develop a theory of border bases for the case when $P/I$
is an infinite-dimensional vector space? At the moment, despite the infinite
tango, we are unfortunately lacking the inspiration to achieve this goal. Some
ideas are presented in [Ste04, Ch. 11].


As for our notation, we refer the readers to [KR00]. In particular, we let
$P = K[x_1, \ldots, x_n]$ be a polynomial ring over a field $K$. A polynomial of
the form $x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, where $\alpha_1, \ldots, \alpha_n \in \mathbb{N}$, is called a term (or a power
product). The monoid of all terms in $P$ is denoted by $\mathbb{T}^n$.


4.1 Commuting endomorphisms 


Tango has the habit of waiting 
(Anibal Troilo, virtuoso bandoneonist) 


Every polynomial ideal $I$ is accompanied by the quotient algebra $P/I$.
A zero-dimensional ideal $I$ corresponds to an algebra $P/I$ of finite vector
space dimension over $K$. The first part of this section reviews how the
$K$-algebra $P/I$ is characterized by its $P$-module structure and how the latter is
given by $n$ pairwise commuting multiplication endomorphisms of the $K$-vector
space $P/I$. In particular, for zero-dimensional ideals these endomorphisms can
be represented by pairwise commuting multiplication matrices. Then we
address the converse realization problem: Which collections of $n$ pairwise
commuting matrices can be preassigned as multiplication matrices corresponding
to a zero-dimensional ideal? A necessary and sufficient condition is that these
matrices induce a cyclic $P$-module structure. Whether a $P$-module structure
on a finite-dimensional $K$-vector space is cyclic can be checked effectively;
an algorithm is presented in the second part.


4.1.1 Multiplication endomorphisms 


Given a K-vector space V which carries a P-module structure, there exist 
endomorphisms of V which are associated to the multiplications by the inde- 
terminates. 


Definition 4.1.1. For $i = 1, \ldots, n$, the $P$-linear map
$\varphi_i : V \to V$ defined by $v \mapsto x_i v$
is called the $i$-th multiplication endomorphism of $V$.


The multiplication endomorphisms of $V$ are pairwise commuting, i.e. we have
$\varphi_i \circ \varphi_j = \varphi_j \circ \varphi_i$ for $i, j \in \{1, \ldots, n\}$. The prototype of such a vector space is
given by the following example.

Example 4.1.2. Let $I \subset P$ be an ideal. The quotient algebra $P/I$ possesses a
natural $P$-module structure $P \times P/I \to P/I$ given by $(f, g + I) \mapsto fg + I$.
Hence there are canonical multiplication endomorphisms $X_i : P/I \to P/I$
such that $X_i(f + I) = x_i f + I$ for $f \in P$ and $i = 1, \ldots, n$. Note that $P/I$ is a
cyclic $P$-module with generator $1 + I$.


Remark 4.1.3. Let $\varphi_1,\dots,\varphi_n$ be pairwise commuting endomorphisms of a vector
space $V$. The following three constructions will be used frequently.


1. There is a natural way of equipping $V$ with a $P$-module structure such
   that $\varphi_i$ is the $i$-th multiplication endomorphism of $V$, namely the structure
   defined by
   $$P \times V \longrightarrow V \quad\text{such that}\quad (f, v) \mapsto f(\varphi_1,\dots,\varphi_n)(v)$$

2. There is a ring homomorphism
   $$\eta : P \longrightarrow \operatorname{End}_K(V) \quad\text{such that}\quad f \mapsto f(\varphi_1,\dots,\varphi_n)$$

3. Every ring homomorphism $\eta : P \to \operatorname{End}_K(V)$ induces a $P$-module structure
   on $V$ via the rule $f \cdot v = \eta(f)(v)$.


The following result allows us to compute the annihilator of $V$, i.e. the
ideal $\operatorname{Ann}_P(V) = \{f \in P \mid f \cdot V = 0\}$.


Proposition 4.1.4. Let $V$ be a $K$-vector space equipped with a $P$-module
structure corresponding to a ring homomorphism $\eta : P \to \operatorname{End}_K(V)$. Then
we have $\operatorname{Ann}_P(V) = \ker(\eta)$.


Proof. By Remark 4.1.3, we have $f \cdot V = 0$ if and only if $\eta(f) = 0$.
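
As a small computational illustration of Proposition 4.1.4, the following Python sketch evaluates the homomorphism $\eta$ on a polynomial and tests whether the result is the zero matrix. The encoding of polynomials as dictionaries from exponent tuples to coefficients and the helper names `mat_mul` and `eta` are assumptions made for this sketch; the two sample matrices are the ones used in Example 4.1.8.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eta(poly, mats):
    """Evaluate poly = {exponent tuple: coefficient} at the commuting matrices mats."""
    n = len(mats[0])
    result = [[Fraction(0)] * n for _ in range(n)]
    for exps, coeff in poly.items():
        # Build the matrix of the term x_1^e_1 ... x_n^e_n.
        term = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        for M, e in zip(mats, exps):
            for _ in range(e):
                term = mat_mul(term, M)
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * term[i][j]
    return result

M1 = [[0, 1, 1], [0, 2, 1], [0, 1, 1]]
M2 = [[0, 1, 0], [0, 1, 1], [0, 1, 0]]

# The module structure is only defined if the matrices commute.
commute = mat_mul(M1, M2) == mat_mul(M2, M1)

# f = y^2 - x, with x^a y^b encoded as (a, b)
f = {(0, 2): 1, (1, 0): -1}
in_annihilator = all(entry == 0 for row in eta(f, [M1, M2]) for entry in row)
```

For this sample data both flags are true: the matrices commute, and $y^2 - x$ is mapped to the zero matrix, so it lies in $\ker(\eta) = \operatorname{Ann}_P(V)$.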


Of particular interest are P-module structures on V for which V is a cyclic 
P-module. The following proposition shows that such structures are essentially 
of the type given in Example 4.1.2. 


Proposition 4.1.5. Let $V$ be a $K$-vector space and a cyclic $P$-module. Then
there exist an ideal $I \subseteq P$ and a $P$-linear isomorphism
$$\Theta : P/I \longrightarrow V$$
such that the multiplication endomorphisms of $V$ are given by the formula
$\varphi_i = \Theta \circ \mathcal{X}_i \circ \Theta^{-1}$ for $i = 1,\dots,n$.


Proof. Let $w \in V$ be a generator of the $P$-module $V$. Then the $P$-linear map
$\tilde{\Theta} : P \to V$ given by $1 \mapsto w$ is surjective. Let $I = \ker\tilde{\Theta}$ be its kernel and
consider the induced isomorphism of $P$-modules $\Theta : P/I \to V$. The $P$-linearity
of $\Theta$ shows $\Theta(\mathcal{X}_i(g + I)) = \varphi_i(\Theta(g + I))$ for $1 \le i \le n$ and $g + I \in P/I$.

By [KR00], Proposition 3.7.1, zero-dimensional ideals $I \subseteq P$ are characterized
by $\dim_K(P/I) < \infty$. Hence, if the vector space $V$ in Proposition 4.1.5
is finite-dimensional, the ideal $I$ is necessarily zero-dimensional. Now we want
to answer the following question: given $\varphi_1,\dots,\varphi_n$, when is $V$ a cyclic $P$-module via
the structure defined in Remark 4.1.3.1? We note that if the $P$-module $V$ is
cyclic, then there exists an element $w \in V$ such that $\operatorname{Ann}_P(w) = \operatorname{Ann}_P(V)$.


Proposition 4.1.6 (Characterization of Cyclic P-Modules).
Let $V$ be a $K$-vector space which carries the structure of a $P$-module.


1. Given $w \in V$, we have $\operatorname{Ann}_P(V) \subseteq \operatorname{Ann}_P(w)$. In particular, there exists a
   $P$-linear map $\Psi_w : P/\operatorname{Ann}_P(V) \to V$ defined by $f + \operatorname{Ann}_P(V) \mapsto f \cdot w$.

2. Let $w \in V$. The map $\Psi_w$ is an isomorphism of $P$-modules if and only
   if $w$ generates $V$ as a $P$-module.


Proof. The first claim follows from the definitions. To prove the second claim,
we note that if $\Psi_w$ is an isomorphism, then we have $V = P \cdot w$. Conversely,
suppose that $V = P \cdot w$. Then the map $\Psi_w$ is surjective. Let $f \in P$ be such that
$f + \operatorname{Ann}_P(V) \in \ker(\Psi_w)$. Then $f(\varphi_1,\dots,\varphi_n) \cdot w = 0$ implies $f(\varphi_1,\dots,\varphi_n) = 0$
since $w$ generates $V$. Hence we see that $f \in \operatorname{Ann}_P(V)$ and $\Psi_w$ is injective.


4.1.2 Commuting matrices 


In what follows, we let $V$ be a finite-dimensional $K$-vector space and $\mu$ its
dimension. We fix a $K$-basis $\mathcal{V} = (v_1,\dots,v_\mu)$ of $V$. Thus every endomorphism
of $V$ can be represented by a matrix of size $\mu \times \mu$ over $K$. In particular, when $V$
is a $P$-module, then $\mathcal{M}_1,\dots,\mathcal{M}_n$ denote the matrices corresponding to the
multiplication endomorphisms $\varphi_1,\dots,\varphi_n$.

Using the following variant of the Buchberger-Möller algorithm, we can
calculate $\operatorname{Ann}_P(V)$ as the kernel of the composite map
$$\eta : P \longrightarrow \operatorname{End}_K(V) \cong \operatorname{Mat}_\mu(K)$$
where $\eta$ is the map defined in Remark 4.1.3.2. Moreover, the algorithm
provides a vector space basis of $P/\operatorname{Ann}_P(V)$. To facilitate the formulation
of this algorithm, we use the following convention. Given a matrix
$A = (a_{ij}) \in \operatorname{Mat}_\mu(K)$, we order its entries by letting $a_{ij} \prec a_{k\ell}$ if $i < k$,
or if $i = k$ and $j < \ell$. In this way we “flatten” the matrix to a vector in $K^{\mu^2}$.
Then we can reduce $A$ against a list of matrices by using the usual Gaussian
reduction procedure.


Theorem 4.1.7 (The Buchberger-Möller Algorithm for Matrices).
Let $\sigma$ be a term ordering on $\mathbb{T}^n$, and let $\mathcal{M}_1,\dots,\mathcal{M}_n \in \operatorname{Mat}_\mu(K)$ be pairwise
commuting matrices. Consider the following sequence of instructions.


M1. Start with empty lists $G = [\,]$, $O = [\,]$, $S = [\,]$, $N = [\,]$, and a list $L = [1]$.

M2. If $L = [\,]$, return the pair $(G, O)$ and stop. Otherwise let $t = \min_\sigma(L)$ and
    delete it from $L$.

M3. Compute $t(\mathcal{M}_1,\dots,\mathcal{M}_n)$ and reduce it against $N = (N_1,\dots,N_k)$ to
    obtain
    $$R = t(\mathcal{M}_1,\dots,\mathcal{M}_n) - \sum_{i=1}^{k} c_i N_i \quad\text{with } c_i \in K$$

M4. If $R = 0$, then append the polynomial $t - \sum_{i=1}^{k} c_i s_i$ to the list $G$, where $s_i$
    denotes the $i$-th element of $S$. Remove from $L$ all multiples of $t$. Continue
    with step M2.

M5. If $R \neq 0$, then append $R$ to the list $N$ and $t - \sum_{i=1}^{k} c_i s_i$ to the list $S$.
    Append the term $t$ to $O$, and append to $L$ those elements of $\{x_1 t,\dots,x_n t\}$
    which are neither multiples of a term in $L$ nor in $\mathrm{LT}_\sigma(G)$. Continue with
    step M2.


This is an algorithm which returns the reduced $\sigma$-Gröbner basis $G$ of $\operatorname{Ann}_P(V)$
and a list of terms $O$ whose residue classes form a $K$-vector space basis
of $P/\operatorname{Ann}_P(V)$.


Proof. Let $I = \operatorname{Ann}_P(V)$, and let $H$ be the reduced $\sigma$-Gröbner basis of $I$.

First we prove termination. In each iteration either step M4 or step M5
is performed. By its construction, the list $N$ always contains linearly independent
matrices. Hence step M5, which appends an element to $N$, can be
performed only finitely many times. By Dickson's Lemma (see [KR00], Corollary 1.3.6),
step M4 can be performed only finitely many times. Thus the
algorithm terminates.

To show correctness, we prove that after a term $t$ has been treated by the
algorithm, the following holds: the list $G$ contains all elements of $H$ whose
leading terms are less than or equal to $t$, and the list $O$ contains all elements
of $\mathbb{T}^n \setminus \mathrm{LT}_\sigma(I)$ which are less than or equal to $t$.

This is true after the first term $t = 1$ has been treated, i.e. appended
to $O$. Now suppose that the algorithm has finished an iteration. By the
method used to append new terms to $L$ in step M5, all elements of the set
$(x_1 O \cup \dots \cup x_n O) \setminus (O \cup \mathrm{LT}_\sigma(I))$ are contained in $L$. From this it follows that
the next term $t$ chosen in step M2 is the smallest term in $\mathbb{T}^n \setminus (O \cup \mathrm{LT}_\sigma(I))$.
Furthermore, the polynomials appended to $S$ in step M5 are supported in $O$.
Hence the polynomial $t - \sum_{i=1}^{k} c_i s_i$ resulting from step M3 of the next iteration
has leading term $t$.

Now suppose that $R = 0$ in step M4. By construction, the matrix of
the endomorphism $\eta(s_i)$ is $N_i$ for $i = 1,\dots,k$. Therefore the polynomial
$g = t - \sum_{i=1}^{k} c_i s_i$ is an element of $I = \operatorname{Ann}_P(V)$. Since the support of $\sum_{i=1}^{k} c_i s_i$
is contained in $O$, the polynomial $g$ is a new element of $H$.

On the other hand, if $R \neq 0$ in step M5, then we claim that the term $t$ is
not contained in $\mathrm{LT}_\sigma(I)$. In view of the way we update $L$ in step M5, the
term $t$ is not in $\mathrm{LT}_\sigma(G)$ for the current list $G$. By induction, the term $t$ is
not a proper multiple of a term in $\mathrm{LT}_\sigma(H)$. Furthermore, the term $t$ is not
the leading term of an element of $H$ because such an element would be of the
form $t - \sum_{i=1}^{k} c_i s_i \in I$ with $c_i \in K$, in contradiction to $R \neq 0$. Altogether it
follows that $t$ is an element of $\mathbb{T}^n \setminus \mathrm{LT}_\sigma(I)$ and can be appended to $O$.

In both cases we see that the claim continues to hold. Therefore, when the
algorithm terminates, we have computed the desired lists $G$ and $O$.


Let us illustrate the performance of this algorithm with an example. 


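Example 4.1.8. Let $V = \mathbb{Q}^3$, and let $\mathcal{V} = (e_1, e_2, e_3)$ be its canonical basis.
Since the two matrices
$$\mathcal{M}_1 = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix} \quad\text{and}\quad \mathcal{M}_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$
commute, they define a $\mathbb{Q}[x, y]$-module structure on $V$. Let us follow the iterations
of the algorithm in computing the reduced $\sigma$-Gröbner basis of $\operatorname{Ann}_P(V)$,
where $\sigma = \mathtt{DegLex}$.


1. $t = 1$, $L = [\,]$, $R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \mathcal{I}_3$, $N = [\mathcal{I}_3]$, $S = [1]$, $O = [1]$, $L = [x, y]$.

2. $t = y$, $L = [x]$, $R = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix} = \mathcal{M}_2$, $N = [\mathcal{I}_3, \mathcal{M}_2]$, $S = [1, y]$, $O = [1, y]$,
   $L = [x, y^2]$.

3. $t = x$, $L = [y^2]$, $R = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \mathcal{M}_1 - \mathcal{M}_2$, $N = [\mathcal{I}_3, \mathcal{M}_2, \mathcal{M}_1 - \mathcal{M}_2]$,
   $S = [1, y, x - y]$, $O = [1, x, y]$, $L = [x^2, xy, y^2]$.

4. $t = y^2$, $L = [x^2, xy]$, $R = 0_{3 \times 3} = \mathcal{M}_2^2 - \mathcal{M}_2 - (\mathcal{M}_1 - \mathcal{M}_2)$, $G = [y^2 - x]$.

5. $t = xy$, $L = [x^2]$, $R = 0_{3 \times 3} = \mathcal{M}_1 \mathcal{M}_2 - 2\mathcal{M}_2 - (\mathcal{M}_1 - \mathcal{M}_2)$,
   $G = [y^2 - x,\ xy - x - y]$.

6. $t = x^2$, $L = [\,]$, $R = 0_{3 \times 3} = \mathcal{M}_1^2 - 3\mathcal{M}_2 - 2(\mathcal{M}_1 - \mathcal{M}_2)$,
   $G = [y^2 - x,\ xy - x - y,\ x^2 - 2x - y]$.


Thus we have $\operatorname{Ann}_P(V) = (y^2 - x,\ xy - x - y,\ x^2 - 2x - y)$, and $O = \{1, x, y\}$
represents a $K$-basis of $P/\operatorname{Ann}_P(V)$.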
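
The steps M1 to M5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: terms are exponent tuples, polynomials are dictionaries from terms to rational coefficients, the term ordering is fixed to DegLex with $x_1 > x_2 > \dots$, and the names `bm_matrices`, `term_matrix`, `divides` and `deglex_key` are chosen here for the sketch.

```python
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def term_matrix(t, mats):
    """Flattened matrix t(M_1, ..., M_n) for an exponent tuple t."""
    n = len(mats[0])
    R = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for M, e in zip(mats, t):
        for _ in range(e):
            R = mat_mul(R, M)
    return [entry for row in R for entry in row]

def divides(s, t):
    return all(a <= b for a, b in zip(s, t))

def deglex_key(t):
    # DegLex: compare total degree first, then the exponent tuples.
    return (sum(t), t)

def bm_matrices(mats):
    """Steps M1-M5; returns (G, O) with polynomials as {term: coeff} dicts."""
    nvars = len(mats)
    G, O, S, N, lead = [], [], [], [], []
    L = [(0,) * nvars]                                     # M1
    while L:                                               # M2
        t = min(L, key=deglex_key)
        L.remove(t)
        R = term_matrix(t, mats)                           # M3
        poly = {t: Fraction(1)}
        for Ni, si in zip(N, S):
            p = next(j for j, v in enumerate(Ni) if v != 0)  # pivot of N_i
            if R[p] != 0:
                c = R[p] / Ni[p]
                R = [r - c * v for r, v in zip(R, Ni)]
                for u, a in si.items():
                    poly[u] = poly.get(u, 0) - c * a
        poly = {u: a for u, a in poly.items() if a != 0}
        if all(r == 0 for r in R):                         # M4
            G.append(poly)
            lead.append(t)
            L = [u for u in L if not divides(t, u)]
        else:                                              # M5
            N.append(R)
            S.append(poly)
            O.append(t)
            for i in range(nvars):
                u = t[:i] + (t[i] + 1,) + t[i + 1:]
                if not any(divides(v, u) for v in L + lead):
                    L.append(u)
    return G, O

M1 = [[0, 1, 1], [0, 2, 1], [0, 1, 1]]
M2 = [[0, 1, 0], [0, 1, 1], [0, 1, 0]]
G, O = bm_matrices([M1, M2])
```

Each matrix stored in `N` has already been reduced against its predecessors, so reducing a new matrix against `N` in list order is ordinary Gaussian elimination on the flattened vectors. For the two matrices of Example 4.1.8, the sketch yields the three polynomials $y^2 - x$, $xy - x - y$, $x^2 - 2x - y$ and the terms $1$, $y$, $x$ (in the order in which they are appended).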


Now we are ready for the main algorithm of this subsection: we can check 
effectively whether a P-module structure given by commuting matrices defines 
a cyclic module. 




Theorem 4.1.9 (Cyclicity Test).

Let $V$ be a finite-dimensional $K$-vector space with basis $\mathcal{V} = (v_1,\dots,v_m)$,
and let $\varphi_1,\dots,\varphi_n$ be pairwise commuting endomorphisms of $V$ given by their
respective matrices $\mathcal{M}_1,\dots,\mathcal{M}_n$. We equip $V$ with the $P$-module structure
defined by $\varphi_1,\dots,\varphi_n$. Consider the following sequence of instructions.


C1. Using Theorem 4.1.7, compute a tuple of terms $O = (t_1,\dots,t_\mu)$ whose
    residue classes form a $K$-basis of $P/\operatorname{Ann}_P(V)$.

C2. If $\dim_K(V) \neq \mu$, then return "V is not cyclic" and stop.

C3. Let $z_1,\dots,z_\mu$ be further indeterminates and $A \in \operatorname{Mat}_\mu(K[z_1,\dots,z_\mu])$ the
    matrix whose columns are $t_i(\mathcal{M}_1,\dots,\mathcal{M}_n) \cdot (z_1,\dots,z_\mu)^{\mathrm{tr}}$ for $i = 1,\dots,\mu$.
    Compute the determinant $d = \det(A) \in K[z_1,\dots,z_\mu]$.

C4. Check if there exists a tuple $(c_1,\dots,c_\mu) \in K^\mu$ such that the polynomial
    value $d(c_1,\dots,c_\mu)$ is non-zero. In this case return "V is cyclic" and
    $w = c_1 v_1 + \dots + c_\mu v_\mu$. Then stop.

C5. Return "V is not cyclic" and stop.


This is an algorithm which checks whether $V$ is a cyclic $P$-module via
$\varphi_1,\dots,\varphi_n$ and, in the affirmative case, computes a generator.


Proof. This procedure is clearly finite. Hence we only have to prove correctness.
By Proposition 4.1.6, we have to check whether $\Psi_w : P/\operatorname{Ann}_P(V) \to V$
is an isomorphism for some $w \in V$. For this it is necessary that the dimensions
of the two vector spaces agree. This condition is checked in step C2.
Then we use the basis elements $\{t_1,\dots,t_\mu\}$ and examine their images for linear
independence. Since we have $\Psi_w(t_i) = t_i(\varphi_1,\dots,\varphi_n)(w)$ for $i = 1,\dots,\mu$,
the map $\Psi_w$ is an isomorphism for some $w \in V$ if and only if the vectors
$\{t_i(\mathcal{M}_1,\dots,\mathcal{M}_n)(c_1,\dots,c_\mu)^{\mathrm{tr}} \mid 1 \le i \le \mu\}$ are $K$-linearly independent for
some tuple $(c_1,\dots,c_\mu) \in K^\mu$. This is checked in step C4.


If the field $K$ is infinite, the check in step C4 can be simplified to checking
$d \neq 0$. For a finite field $K$, we can, in principle, check all tuples in $K^\mu$. Let us
apply this algorithm in the setting of Example 4.1.8.


Example 4.1.10. Let $V$ and $\mathcal{M}_1$, $\mathcal{M}_2$ be defined as in Example 4.1.8. We follow
the steps of the cyclicity test.


C1. The residue classes of $O = \{1, x, y\}$ form a $K$-basis of $P/\operatorname{Ann}_P(V)$.
C2. We have $\mu = 3 = \dim_K(V)$.
C3. We compute $\mathcal{I}_3 \cdot (z_1, z_2, z_3)^{\mathrm{tr}} = (z_1, z_2, z_3)^{\mathrm{tr}}$ as well as
    $\mathcal{M}_1 \cdot (z_1, z_2, z_3)^{\mathrm{tr}} = (z_2 + z_3,\ 2z_2 + z_3,\ z_2 + z_3)^{\mathrm{tr}}$ and
    $\mathcal{M}_2 \cdot (z_1, z_2, z_3)^{\mathrm{tr}} = (z_2,\ z_2 + z_3,\ z_2)^{\mathrm{tr}}$. Thus we let
    $$A = \begin{pmatrix} z_1 & z_2 + z_3 & z_2 \\ z_2 & 2z_2 + z_3 & z_2 + z_3 \\ z_3 & z_2 + z_3 & z_2 \end{pmatrix}$$
    and calculate $d = \det(A) = (z_1 - z_3)(z_2^2 - z_2 z_3 - z_3^2)$.
C4. Since $K$ is infinite and $d \neq 0$, the algorithm returns "V is cyclic". For
    instance, since $d(1, 1, 0) = 1$, the element $w = e_1 + e_2$ generates $V$ as a
    $P$-module.

The following example shows that $V$ can fail to be cyclic even when the
dimensions of $V$ and $P/\operatorname{Ann}_P(V)$ agree.


Example 4.1.11. Let $V = \mathbb{Q}^3$, and let $\mathcal{V} = (e_1, e_2, e_3)$ be its canonical basis. We
equip $V$ with the $\mathbb{Q}[x, y]$-module structure defined by the commuting matrices
$$\mathcal{M}_1 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad\text{and}\quad \mathcal{M}_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$

Let us apply the cyclicity test step-by-step.


C1. The algorithm of Theorem 4.1.7 yields $O = \{1, x, y\}$.
C2. We have $\mu = 3 = \dim_K(V)$.
C3. We calculate $A = \begin{pmatrix} z_1 & 0 & 0 \\ z_2 & z_1 & z_3 \\ z_3 & 0 & 0 \end{pmatrix}$ and $d = \det(A) = 0$.
C5. The algorithm returns "V is not cyclic".
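
The determinant values used in Examples 4.1.10 and 4.1.11 can be checked numerically. The sketch below assumes exact rational arithmetic and uses helper names (`mat_vec`, `det`, `d_value`) invented for this illustration; it evaluates $d$ at a given tuple $c$ by building the matrix whose columns are the images of $c$ under the operators $t_i(\mathcal{M}_1, \mathcal{M}_2)$ for $O = \{1, x, y\}$. A finite search over sample tuples, as done for the second example, can confirm but never prove that $d$ vanishes identically.

```python
from fractions import Fraction
from itertools import product

def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def det(A):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in A]
    n, d = len(A), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if A[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return d

def d_value(ops, c):
    """Value of d at c: determinant of the matrix with columns T(c), T in ops."""
    columns = [mat_vec(T, c) for T in ops]
    return det([list(row) for row in zip(*columns)])

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Example 4.1.10: the operators 1, M1, M2
ops_a = [I3, [[0, 1, 1], [0, 2, 1], [0, 1, 1]], [[0, 1, 0], [0, 1, 1], [0, 1, 0]]]
# Example 4.1.11: the operators 1, M1, M2
ops_b = [I3, [[0, 0, 0], [1, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 1], [0, 0, 0]]]

value_a = d_value(ops_a, (1, 1, 0))
nonzero_b = [c for c in product(range(-2, 3), repeat=3) if d_value(ops_b, c) != 0]
```

Here `value_a` equals 1, in agreement with $d(1, 1, 0) = 1$, and `nonzero_b` stays empty, which is consistent with $d = 0$ in Example 4.1.11.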


We end this section by considering the special case n = 1. In this univariate 
case some of the topics discussed in this section look very familiar. 


Example 4.1.12. Suppose we are given a finitely generated $K$-vector space $V$
and an endomorphism $\varphi$ of $V$. We let $P = K[x]$ and observe that $V$ becomes
a $P$-module via the rule $(f, v) \mapsto f(\varphi)(v)$. When is it a cyclic $P$-module? Let
us interpret the meaning of the steps of our cyclicity test in the univariate
case. To start with, let $\mathcal{M}$ be a matrix representing $\varphi$.


C1. The algorithm of Theorem 4.1.7 applied to $\mathcal{M}$ yields a monic polynomial
    $f(x) = x^d + c_{d-1} x^{d-1} + \dots + c_0$, which is the minimal polynomial of $\mathcal{M}$
    (and of $\varphi$), and the tuple $O = (1, x, x^2, \dots, x^{d-1})$.

C2. The minimal polynomial of $\mathcal{M}$ is a divisor of the characteristic polynomial
    of $\mathcal{M}$, and the degree of the latter is $\dim_K(V)$. So the algorithm stops at
    step C2 only if the minimal polynomial and the characteristic polynomial
    differ.

C3. The matrix $A$ can be interpreted as the matrix whose columns are the
    vectors $v, \varphi(v), \dots, \varphi^{d-1}(v)$ for a generic $v$. If $\det(A) = 0$, then the
    endomorphisms $1, \varphi, \dots, \varphi^{d-1}$ are linearly dependent, a contradiction. Hence
    $\det(A)$ necessarily is non-zero and $V$ is a cyclic $P$-module.


In conclusion, steps C3, C4, C5 are redundant in the univariate case. This
corresponds to the well-known fact that $V$ is a cyclic $K[x]$-module if and only
if the minimal polynomial and the characteristic polynomial of $\varphi$ coincide.
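
Two small matrices make this criterion concrete. They are chosen here purely for illustration and do not come from the text.

```latex
% Case 1: minimal and characteristic polynomial differ.
\[
\mathcal{M} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\text{minimal polynomial } x - 1, \qquad
\text{characteristic polynomial } (x - 1)^2 .
\]
% Step C1 gives O = (1), so \mu = 1 \neq 2 = \dim_K(V), and the test stops in
% step C2: V is not cyclic. Indeed, P \cdot w = K w is a line for every w.

% Case 2: minimal and characteristic polynomial coincide.
\[
\mathcal{M} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
\text{minimal polynomial } x^2 = \text{characteristic polynomial} .
\]
% Step C1 gives O = (1, x) and \mu = 2 = \dim_K(V). For w = e_2 we get
% \mathcal{M} w = e_1, so (w, \mathcal{M} w) = (e_2, e_1) is a basis and w generates V.
```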


4.2 Border prebases


Given a zero-dimensional polynomial ideal $I$, we want to study the residue
class ring $P/I$ by choosing a $K$-basis and examining the multiplication matrices
with respect to that basis. How can we find a basis having “nice” properties?
One possibility is to take the residue classes of the terms in an order
ideal, i.e. in a finite set of terms which is closed under forming divisors.

The choice of an order ideal $O$ yields additional structure on the monoid
of terms $\mathbb{T}^n$. For instance, there are terms forming the border of $O$, i.e. terms
$t$ outside $O$ such that there exist an indeterminate $x_i$ and a term $t'$ in $O$
with $t = x_i t'$. Moreover, every term $t$ has an $O$-index which measures the
distance from $t$ to $O$. The properties of order ideals, borders, and $O$-indices
are collected in the first subsection.

The second subsection deals with $O$-border prebases. These are sets of
polynomials each of which consists of one term in the border of $O$ and a
linear combination of terms in $O$. Using $O$-border prebases, we construct a
division algorithm and define normal remainders.


4.2.1 Order ideals 


Let $\mathbb{T}^n$ denote the monoid of terms in $n$ indeterminates. Moreover, for every
$d \ge 0$, we let $\mathbb{T}^n_d$ be the set of terms of degree $d$ and $\mathbb{T}^n_{<d} = \bigcup_{i=0}^{d-1} \mathbb{T}^n_i$. The
following kind of subset of $\mathbb{T}^n$ is central to this section.


Definition 4.2.1. A non-empty, finite set of terms $O \subseteq \mathbb{T}^n$ is called an order
ideal if it is closed under forming divisors, i.e. if $t \in O$ and $t' \mid t$ imply $t' \in O$.


Order ideals have many other names in the literature. For instance, sta- 
tisticians sometimes call them complete sets of estimable terms (see Sec- 
tion 4.4). In Chapter 3, the more general notion of “sets of polynomials con- 
nected to 1” is used. 


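Definition 4.2.2. Let $O \subseteq \mathbb{T}^n$ be an order ideal.

1. The border of $O$ is the set
   $$\partial O = \mathbb{T}^n_1 \cdot O \setminus O = (x_1 O \cup \dots \cup x_n O) \setminus O$$
   The first border closure of $O$ is the set $\overline{\partial O} = O \cup \partial O$.

2. For every $k \ge 1$, we inductively define the $(k+1)$-st border of $O$ by
   $\partial^{k+1} O = \partial(\overline{\partial^k O})$ and the $(k+1)$-st border closure of $O$ by the rule
   $\overline{\partial^{k+1} O} = \overline{\partial^k O} \cup \partial^{k+1} O$. For convenience, we let $\partial^0 O = \overline{\partial^0 O} = O$.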


The $k$-th border closure of an order ideal $O$ is an order ideal for every $k \ge 0$.
In Chapter 3, the $k$-th border of $O$ is denoted by $O^{[k]}$.
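
Definition 4.2.2 translates almost literally into code. The following Python sketch is an illustration under the assumption that terms are stored as exponent tuples; the function names `border`, `border_closure` and `kth_border` are chosen for this sketch. The sample order ideal $\{1, x, y\}$ is the one that reappears in Example 4.2.11.

```python
def border(O):
    """Border of a finite set of terms: (x_1 O u ... u x_n O) minus O."""
    O = set(O)
    shifted = {t[:i] + (t[i] + 1,) + t[i + 1:] for t in O for i in range(len(t))}
    return shifted - O

def border_closure(O, k=1):
    """k-th border closure: apply 'add the border' k times."""
    closure = set(O)
    for _ in range(k):
        closure |= border(closure)
    return closure

def kth_border(O, k):
    """k-th border: the border of the (k-1)-st border closure; O itself for k = 0."""
    return set(O) if k == 0 else border(border_closure(O, k - 1))

# O = {1, x, y}, with x^a y^b written as (a, b)
O = {(0, 0), (1, 0), (0, 1)}
first_border = kth_border(O, 1)    # the terms x^2, xy, y^2
second_border = kth_border(O, 2)   # the terms x^3, x^2 y, x y^2, y^3
```

Because each border is computed from the previous closure, the sets `kth_border(O, k)` for different `k` are disjoint by construction, which is the partition stated in Proposition 4.2.4.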


Example 4.2.3. Let $O$ be the order ideal $\{1, x, y, x^2, xy, y^2, x^3, x^2y, y^3, x^4, x^3y\}$
in $\mathbb{T}^2$. Then we visualize $O$ and its first two borders as follows.


180 A. Kehrein, M. Kreuzer, and L. Robbiano 


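[Figure: the terms of $O$, its first border $\partial O$, and its second border $\partial^2 O$, drawn as lattice points in the $(x, y)$-plane.]


Let us collect some properties of order ideals, their borders and border
closures.


Proposition 4.2.4 (Basic Properties of Borders).
Let $O \subseteq \mathbb{T}^n$ be an order ideal.


1. For every $k \ge 0$, we have a disjoint union $\overline{\partial^k O} = \bigcup_{i=0}^{k} \partial^i O$.

2. For every $k \ge 1$, we have $\partial^k O = \mathbb{T}^n_k \cdot O \setminus \mathbb{T}^n_{<k} \cdot O$.

3. We have a disjoint union $\mathbb{T}^n = \bigcup_{i=0}^{\infty} \partial^i O$.

4. A term $t \in \mathbb{T}^n$ is divisible by a term in $\partial O$ if and only if $t \in \mathbb{T}^n \setminus O$.


Proof. The definition of the first border closure of $O$ yields $\overline{\partial O} = O \cup \mathbb{T}^n_1 \cdot O$.
Inductively, it follows that $\overline{\partial^{k+1} O} = \overline{\partial^k O} \cup \mathbb{T}^n_1 \cdot \overline{\partial^k O} = \overline{\partial^k O} \cup \mathbb{T}^n_{k+1} \cdot O$. This
proves the first claim. Then the second claim is a consequence of the equality
$\partial^{k+1} O = \overline{\partial^{k+1} O} \setminus \overline{\partial^k O}$. The third claim follows from the observation that
every term is in $\overline{\partial^k O}$ for some $k \ge 0$.

Finally, the fourth claim holds because the second claim implies the fact
that $t \in \partial^k O$ for some $k \ge 1$ is equivalent to the existence of a factorization
$t = t' t''$ where $\deg(t') = k - 1$ and $t'' \in \partial O$.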


The above partition of $\mathbb{T}^n$ allows us to define a “distance” between a term
and an order ideal.


Definition 4.2.5. Let $O \subseteq \mathbb{T}^n$ be an order ideal.


1. For every $t \in \mathbb{T}^n$, there exists a unique number $k \in \mathbb{N}$ such that $t \in \partial^k O$.
   We call $k$ the index of $t$ with respect to $O$ and write $\operatorname{ind}_O(t) = k$.

2. For an arbitrary polynomial $f \in P \setminus \{0\}$, we define the index of $f$ with
   respect to $O$ by $\operatorname{ind}_O(f) = \max\{\operatorname{ind}_O(t) \mid t \in \operatorname{Supp}(f)\}$.


By this definition, the $k$-th border of $O$ consists precisely of the terms
of index $k$. Notice that every polynomial $f \in P \setminus \{0\}$ has a representation
$f = c_1 t_1 + \dots + c_s t_s$, where $c_1,\dots,c_s \in K \setminus \{0\}$ and such that $t_1,\dots,t_s \in \mathbb{T}^n$
satisfy $\operatorname{ind}_O(t_1) \ge \dots \ge \operatorname{ind}_O(t_s)$. However, this representation is in general
not unique since several terms in the support of $f$ may have the same index
with respect to $O$.

Let us point out some of the most useful properties of the index.

Proposition 4.2.6. Let $O \subseteq \mathbb{T}^n$ be an order ideal.


1. For a term $t \in \mathbb{T}^n$, the number $k = \operatorname{ind}_O(t)$ is the smallest natural number
   such that $t = t' t''$ with a term $t' \in \mathbb{T}^n$ of degree $k$ and with $t'' \in O$.

2. Given two terms $t, t' \in \mathbb{T}^n$, we have $\operatorname{ind}_O(t t') \le \deg(t) + \operatorname{ind}_O(t')$.

3. For $f, g \in P \setminus \{0\}$ such that $f + g \neq 0$, we have
   $$\operatorname{ind}_O(f + g) \le \max\{\operatorname{ind}_O(f), \operatorname{ind}_O(g)\}$$

4. For $f, g \in P \setminus \{0\}$, we have
   $$\operatorname{ind}_O(f g) \le \min\{\deg(f) + \operatorname{ind}_O(g),\ \deg(g) + \operatorname{ind}_O(f)\}$$


Proof. The first claim follows from the proof of Proposition 4.2.4.4. The second
claim follows from the first. The third claim is a consequence of the inclusion
$\operatorname{Supp}(f + g) \subseteq \operatorname{Supp}(f) \cup \operatorname{Supp}(g)$. The last claim follows from the observation
that $\operatorname{Supp}(fg) \subseteq \{t' t'' \mid t' \in \operatorname{Supp}(f),\ t'' \in \operatorname{Supp}(g)\}$ and from the second
claim.


Although the partial ordering on $\mathbb{T}^n$ defined by the index appears similar
to a term ordering, it has a serious drawback: this ordering is incompatible
with term multiplication, i.e. $\operatorname{ind}_O(t) \ge \operatorname{ind}_O(t')$ does not, in general, imply
$\operatorname{ind}_O(t t'') \ge \operatorname{ind}_O(t' t'')$. Our next example is a case in point.


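Example 4.2.7. Let $O = \{1, x, x^2\} \subseteq \mathbb{T}(x, y)$. Then $O$ is an order ideal with
border $\partial O = \{y, xy, x^2y, x^3\}$. The following sketch illustrates the situation.


[Sketch: the terms of $O$ and of $\partial O$ in the $(x, y)$-plane.]


Multiplying the terms on both sides of the inequality $\operatorname{ind}_O(y) \ge \operatorname{ind}_O(x^2)$
by $x^2$, we get $\operatorname{ind}_O(x^2 \cdot y) < \operatorname{ind}_O(x^2 \cdot x^2)$. Similarly, if we multiply the terms
on both sides of the equality $\operatorname{ind}_O(y) = \operatorname{ind}_O(x^2 y)$ by $x$, we get the inequality
$\operatorname{ind}_O(x \cdot y) < \operatorname{ind}_O(x \cdot x^2 y)$.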
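
The index values quoted in Example 4.2.7 can be recomputed from the characterization in Proposition 4.2.6.1. The sketch below is an illustration only: terms are exponent tuples, and the names `index` and `times` are chosen here.

```python
def index(t, O):
    """ind_O(t): least degree of t' over all factorizations t = t' * t'' with t'' in O.

    O must be an order ideal, so it contains the term 1 and the minimum exists.
    """
    return min(sum(t) - sum(s) for s in O
               if all(a <= b for a, b in zip(s, t)))

def times(s, t):
    """Product of two terms given as exponent tuples."""
    return tuple(a + b for a, b in zip(s, t))

# O = {1, x, x^2}, with x^a y^b written as (a, b)
O = [(0, 0), (1, 0), (2, 0)]
x, y, x2, x2y = (1, 0), (0, 1), (2, 0), (2, 1)

before = (index(y, O), index(x2, O))                       # indices of y and x^2
after = (index(times(x2, y), O), index(times(x2, x2), O))  # after multiplying by x^2
```

Here `before` is `(1, 0)` while `after` is `(1, 2)`, so multiplication by $x^2$ reverses the comparison, as the example states.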


4.2.2 Border division 


In this subsection we introduce an important tool for dealing with zero-dimensional
ideals: an $O$-border prebasis, i.e. a set of polynomials of which
each is a linear combination of one term in $\partial O$ and terms in $O$. In this way
we imitate the definition of a Gröbner basis where each polynomial is a linear
combination of the leading term and smaller terms. Then we present a
process for dividing arbitrary polynomials by those of an $O$-border prebasis.
However, the remainder of this division process is not uniquely determined.
This indicates that $O$-border prebases are a first step in the right direction
and that we must take one more step in the next section.

Definition 4.2.8. Let $O = \{t_1,\dots,t_\mu\}$ be an order ideal, and let $\partial O = \{b_1,\dots,b_\nu\}$
be its border. A set of polynomials $G = \{g_1,\dots,g_\nu\}$ is called an
$O$-border prebasis if the polynomials have the form $g_j = b_j - \sum_{i=1}^{\mu} \alpha_{ij} t_i$
such that $\alpha_{ij} \in K$ for $1 \le i \le \mu$ and $1 \le j \le \nu$.


In particular, a border prebasis can be interpreted as a tuple of polynomials
marked by the border terms $(b_1,\dots,b_\nu)$ in the following sense.


Definition 4.2.9. A pair $(g, b)$ is said to be a marked polynomial if $g$ is a
non-zero polynomial and $b \in \operatorname{Supp}(g)$ with coefficient 1. A tuple of polynomials
$(g_1,\dots,g_\nu)$ is marked by a tuple of terms $(b_1,\dots,b_\nu)$ if $(g_1, b_1),\dots,(g_\nu, b_\nu)$
are marked polynomials.


The definition of a border prebasis only fixes the shape of our generators. 
Note that this notion requires a bit more than that of marked polynomials — 
the unmarked terms in the polynomial’s support have to be in the order ideal. 
Border prebases are already sufficient to perform polynomial division with 
remainder. The following algorithm provides a fundamental tool in working 
with border prebases. It is similar to the procedure called “B-reduction” in 
Chapter 3. 
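
The shape condition of Definition 4.2.8 is easy to test mechanically. The following Python sketch is a hypothetical validator written for this purpose: polynomials are dictionaries from exponent tuples to non-zero coefficients, and the names `border`, `is_order_ideal` and `is_border_prebasis` are assumptions of the sketch. The sample data are the polynomials that the text uses later in Example 4.2.11.

```python
def border(O):
    O = set(O)
    return {t[:i] + (t[i] + 1,) + t[i + 1:] for t in O for i in range(len(t))} - O

def is_order_ideal(O):
    """Non-empty and closed under dividing by any single indeterminate."""
    O = set(O)
    return bool(O) and all(t[:i] + (t[i] - 1,) + t[i + 1:] in O
                           for t in O for i in range(len(t)) if t[i] > 0)

def is_border_prebasis(G, marks, O):
    """G: list of {term: coeff} dicts (no zero coefficients stored),
    marks: the border terms b_1, ..., b_nu, O: the order ideal."""
    O = set(O)
    if not is_order_ideal(O) or sorted(marks) != sorted(border(O)):
        return False
    if len(G) != len(marks):
        return False
    # Each g_j must contain b_j with coefficient 1 and otherwise only terms of O.
    return all(g.get(b) == 1 and all(t in O for t in g if t != b)
               for g, b in zip(G, marks))

# O = {1, x, y}; g1 = x^2 + x + 1, g2 = xy + y, g3 = y^2 + x + 1
O = {(0, 0), (1, 0), (0, 1)}
marks = [(2, 0), (1, 1), (0, 2)]
G = [{(2, 0): 1, (1, 0): 1, (0, 0): 1},
     {(1, 1): 1, (0, 1): 1},
     {(0, 2): 1, (1, 0): 1, (0, 0): 1}]
ok = is_border_prebasis(G, marks, O)
```

Checking closure under division by a single indeterminate suffices for `is_order_ideal`, since every divisor of a term is reached by repeatedly removing one indeterminate.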


Proposition 4.2.10 (Border Division Algorithm).

Let $O = \{t_1,\dots,t_\mu\}$ be an order ideal, let $\partial O = \{b_1,\dots,b_\nu\}$ be its border, and
let $\{g_1,\dots,g_\nu\}$ be an $O$-border prebasis. Given a polynomial $f \in P$, consider
the following instructions.


D1. Let $f_1 = \dots = f_\nu = 0$, $c_1 = \dots = c_\mu = 0$, and $h = f$.

D2. If $h = 0$, then return $(f_1,\dots,f_\nu, c_1,\dots,c_\mu)$ and stop.

D3. If $\operatorname{ind}_O(h) = 0$, then find $c_1,\dots,c_\mu \in K$ such that $c_1 t_1 + \dots + c_\mu t_\mu = h$.
    Return $(f_1,\dots,f_\nu, c_1,\dots,c_\mu)$ and stop.

D4. If $\operatorname{ind}_O(h) > 0$, then let $h = a_1 h_1 + \dots + a_s h_s$ such that $a_1,\dots,a_s \in K \setminus \{0\}$
    and $h_1,\dots,h_s \in \mathbb{T}^n$ satisfy $\operatorname{ind}_O(h_1) = \operatorname{ind}_O(h)$. Determine the smallest
    index $i \in \{1,\dots,\nu\}$ such that $h_1$ factors as $h_1 = t' b_i$ and so that the term
    $t' \in \mathbb{T}^n$ has degree $\operatorname{ind}_O(h) - 1$. Subtract $a_1 t' g_i$ from $h$, add $a_1 t'$ to $f_i$, and
    continue with step D2.


This is an algorithm which returns a tuple $(f_1,\dots,f_\nu, c_1,\dots,c_\mu) \in P^\nu \times K^\mu$
such that
$$f = f_1 g_1 + \dots + f_\nu g_\nu + c_1 t_1 + \dots + c_\mu t_\mu$$
and $\deg(f_i) \le \operatorname{ind}_O(f) - 1$ for all $i \in \{1,\dots,\nu\}$ with $f_i g_i \neq 0$. This representation
does not depend on the choice of the term $h_1$ in Step D4.


For the reader’s convenience we reproduce the proof from [KK08a]. 


Proof. First we show that Step D4 can be executed. Let $k = \operatorname{ind}_O(h_1)$. By
Proposition 4.2.4.2, there is a factorization $h_1 = \tilde{t}\, t_i$ for some term $\tilde{t}$ of degree $k$
and some $t_i \in O$, and there is no such factorization with a term $\tilde{t}$ of smaller
degree. Since $k > 0$, we can write $\tilde{t} = t' x_j$ for some $t' \in \mathbb{T}^n$ and $j \in \{1,\dots,n\}$.
Then we have $\deg(t') = k - 1$, and the fact that $\tilde{t}$ has the smallest possible
degree implies $x_j t_i \in \partial O$. Thus we have $h_1 = t' (x_j t_i) = t' b_m$ for some
$b_m \in \partial O$.

Next we prove termination. We show that Step D4 is performed only finitely
many times. Let us investigate the subtraction $h - a_1 t' g_i$ in Step D4.
Using the representation $g_i = b_i - \sum_{\ell=1}^{\mu} \alpha_{\ell i} t_\ell$ given in Definition 4.2.8, this
subtraction becomes
$$h - a_1 t' g_i = a_1 h_1 + \dots + a_s h_s - a_1 t' b_i + a_1 t' \sum_{\ell=1}^{\mu} \alpha_{\ell i} t_\ell$$
Now $a_1 h_1 = a_1 t' b_i$ shows that a term of index $\operatorname{ind}_O(h)$ is removed from $h$ and
replaced by terms of the form $t' t_\ell \in \overline{\partial^{k-1} O}$ which have strictly smaller index.
The algorithm terminates after finitely many steps because, for a given term,
there are only finitely many terms of smaller or equal index.

Finally, we prove correctness. To this end, we show that the equation
$$f = h + f_1 g_1 + \dots + f_\nu g_\nu + c_1 t_1 + \dots + c_\mu t_\mu$$
is an invariant of the algorithm. It is satisfied at the end of Step D1. A polynomial
$f_i$ is only changed in Step D4. There the subtraction $h - a_1 t' g_i$ is
compensated by the addition $(f_i + a_1 t') g_i$. The constants $c_1,\dots,c_\mu$ are only
changed in Step D3 in which $h$ is replaced by the expression $c_1 t_1 + \dots + c_\mu t_\mu$.
When the algorithm stops, we have $h = 0$. This proves the stated representation
of $f$. The additional claim that this representation does not depend on
the choice of $h_1$ in Step D4 follows from the observation that $h_1$ is replaced
by terms of strictly smaller index. Thus the different executions of Step D4
corresponding to the reduction of several terms of a given $O$-index in $h$ do not
interfere with one another, and the final result, after all those terms have
been rewritten, is independent of the order in which they have been taken
care of.
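
The steps D1 to D4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: terms are exponent tuples, polynomials are dictionaries from terms to rational coefficients, and the names `index` and `border_division` are chosen for this sketch. Among the terms of maximal index, the sketch simply picks the largest exponent tuple as $h_1$; by the proposition this choice does not affect the result. The sample data are those of Example 4.2.11.

```python
from fractions import Fraction

def index(t, O):
    """ind_O(t) for an exponent tuple t and an order ideal O (which contains 1)."""
    return min(sum(t) - sum(s) for s in O
               if all(a <= b for a, b in zip(s, t)))

def border_division(f, O, marks, G):
    """Steps D1-D4. f and the g_i are {term: coeff} dicts; marks[i] marks G[i].

    Returns (quotients, coeffs): quotients[i] is the cofactor of G[i] as a dict,
    and coeffs lists c_1, ..., c_mu in the order of O.
    """
    h = {t: Fraction(c) for t, c in f.items() if c != 0}         # D1
    quotients = [{} for _ in G]
    while h:                                                     # D2
        k = max(index(t, O) for t in h)
        if k == 0:                                               # D3
            break
        h1 = max(t for t in h if index(t, O) == k)               # D4
        a1 = h[h1]
        # smallest i with h1 = t' * b_i and deg(t') = k - 1
        i = next(j for j, b in enumerate(marks)
                 if all(x <= y for x, y in zip(b, h1))
                 and sum(h1) - sum(b) == k - 1)
        tp = tuple(y - x for x, y in zip(marks[i], h1))          # the term t'
        for u, c in G[i].items():                                # h := h - a1 t' g_i
            v = tuple(x + y for x, y in zip(tp, u))
            h[v] = h.get(v, 0) - a1 * c
            if h[v] == 0:
                del h[v]
        quotients[i][tp] = quotients[i].get(tp, 0) + a1          # f_i := f_i + a1 t'
    return quotients, [h.get(t, Fraction(0)) for t in O]

# Data of Example 4.2.11, with x^a y^b written as (a, b)
O = [(0, 0), (1, 0), (0, 1)]
marks = [(2, 0), (1, 1), (0, 2)]
g1 = {(2, 0): 1, (1, 0): 1, (0, 0): 1}
g2 = {(1, 1): 1, (0, 1): 1}
g3 = {(0, 2): 1, (1, 0): 1, (0, 0): 1}
f = {(3, 2): 1, (1, 2): -1, (2, 0): 1, (0, 0): 2}
quotients, coeffs = border_division(f, O, marks, [g1, g2, g3])
```

For the tuple $(g_1, g_2, g_3)$ the sketch returns the coefficients $-1, -3, 0$, that is, the remainder $-3x - 1$, together with the cofactors $xy^2 - y^2 + 1$, $-y$ and $2$. Passing the polynomials and their marking terms in a different order exercises the dependence on the numbering discussed after the proof.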


Notice that in Step D4 the algorithm uses a representation of $h$ which is
not necessarily unique. Moreover, to make the factorization of $h_1$ unique, we
chose the index $i$ minimally, but this choice had not been forced upon us.
Finally, the result of the division depends on the numbering of the elements
of $\partial O$, as our next example shows.


Example 4.2.11. Let $n = 2$, and let $O = \{t_1, t_2, t_3\}$ with $t_1 = 1$, $t_2 = x$, and
$t_3 = y$. Then the border of $O$ is $\partial O = \{b_1, b_2, b_3\}$ with $b_1 = x^2$, $b_2 = xy$,
and $b_3 = y^2$. The polynomials $g_1 = x^2 + x + 1$, $g_2 = xy + y$, and $g_3 = y^2 + x + 1$
constitute an $O$-border prebasis. We want to divide the polynomial
$f = x^3y^2 - xy^2 + x^2 + 2$ by this $O$-border prebasis.

For easy reference, the next borders are $\partial^2 O = \{x^3, x^2y, xy^2, y^3\}$,
$\partial^3 O = \{x^4, x^3y, x^2y^2, xy^3, y^4\}$, and $\partial^4 O = \{x^5, x^4y, x^3y^2, x^2y^3, xy^4, y^5\}$. We apply
the Border Division Algorithm and follow its steps.

D1. Let $f_1 = f_2 = f_3 = 0$, $c_1 = c_2 = c_3 = 0$, and $h = x^3y^2 - xy^2 + x^2 + 2$. The
    $O$-indices of the terms in $h$ are 4, 2, 1 and 0 respectively, so $h$ has index 4.

D4. We have $x^3y^2 = xy^2 \cdot b_1$ with $\deg xy^2 = \operatorname{ind}_O(h) - 1$. Thus we put $f_1 = xy^2$
    and $h = x^3y^2 - xy^2 + x^2 + 2 - xy^2(x^2 + x + 1)$. The terms in the support
    of $h = -x^2y^2 - 2xy^2 + x^2 + 2$ have $O$-indices 3, 2, 1 and 0 respectively.

D4. We have $x^2y^2 = y^2 \cdot b_1$ with $\deg y^2 = \operatorname{ind}_O(h) - 1$. Add $-y^2$ to $f_1$ to obtain
    $f_1 = xy^2 - y^2$ and put $h = -x^2y^2 - 2xy^2 + x^2 + 2 + y^2(x^2 + x + 1)$. The
    terms in the support of $h = -xy^2 + x^2 + y^2 + 2$ have $O$-indices 2, 1, 1 and 0
    respectively.

D4. We have $xy^2 = y \cdot b_2$ with $\deg y = \operatorname{ind}_O(h) - 1$. Put $f_2 = -y$ and put
    $h = -xy^2 + x^2 + y^2 + 2 + y(xy + y)$. The terms in the support of $h = x^2 + 2y^2 + 2$
    have $O$-indices 1, 1 and 0 respectively.

D4. We have $x^2 = 1 \cdot b_1$ with $\deg 1 = \operatorname{ind}_O(h) - 1$. Add 1 to $f_1$ to obtain
    $f_1 = xy^2 - y^2 + 1$ and put $h = x^2 + 2y^2 + 2 - 1 \cdot (x^2 + x + 1)$. The terms in
    the support of $h = 2y^2 - x + 1$ have $O$-indices 1, 0 and 0 respectively.

D4. We have $y^2 = 1 \cdot b_3$ with $\deg 1 = \operatorname{ind}_O(h) - 1$. Add 2 to $f_3$ to obtain $f_3 = 2$
    and put $h = 2y^2 - x + 1 - 2(y^2 + x + 1)$. The terms in the support of
    $h = -3x - 1$ have $O$-indices 0 and 0. Thus $\operatorname{ind}_O(h) = 0$.

D3. We have $h = -1 \cdot t_1 - 3 t_2 + 0 t_3$. The algorithm returns the following tuple
    $(xy^2 - y^2 + 1,\ -y,\ 2,\ -1,\ -3,\ 0)$ and stops.


Therefore we have a representation
$$f = (xy^2 - y^2 + 1)\, g_1 - y\, g_2 + 2\, g_3 - 1\, t_1 - 3\, t_2 + 0\, t_3$$

Second we perform the algorithm with respect to the shuffled tuple
$(g_1', g_2', g_3') = (g_3, g_2, g_1)$, which is marked by $(b_1', b_2', b_3') = (b_3, b_2, b_1)$.


D1. Let $f_1' = f_2' = f_3' = 0$, $c_1 = c_2 = c_3 = 0$, and $h = x^3y^2 - xy^2 + x^2 + 2$. The
    $O$-indices of the terms in the support of $h$ are 4, 2, 1 and 0 respectively,
    so $h$ has index 4.

D4. We have $x^3y^2 = x^3 \cdot b_1'$ with $\deg x^3 = \operatorname{ind}_O(h) - 1$. Thus we put $f_1' = x^3$
    and $h = x^3y^2 - xy^2 + x^2 + 2 - x^3(y^2 + x + 1)$. The terms in the support
    of $h = -x^4 - x^3 - xy^2 + x^2 + 2$ have $O$-indices 3, 2, 2, 1 and 0 respectively.

D4. We have $x^4 = x^2 \cdot b_3'$ with $\deg x^2 = \operatorname{ind}_O(h) - 1$. Add $-x^2$ to $f_3'$ to obtain
    $f_3' = -x^2$ and put $h = -x^4 - x^3 - xy^2 + x^2 + 2 + x^2(x^2 + x + 1)$. The terms in
    the support of $h = -xy^2 + 2x^2 + 2$ have $O$-indices 2, 1, and 0 respectively.

D4. We have $xy^2 = x \cdot b_1'$ with $\deg x = \operatorname{ind}_O(h) - 1$. Add $-x$ to $f_1'$ to obtain
    $f_1' = x^3 - x$ and put $h = -xy^2 + 2x^2 + 2 + x(y^2 + x + 1)$. The terms in the
    support of $h = 3x^2 + x + 2$ have $O$-indices 1, 0 and 0 respectively.

D4. We have $x^2 = 1 \cdot b_3'$ with $\deg 1 = \operatorname{ind}_O(h) - 1$. Add 3 to $f_3'$ to obtain
    $f_3' = -x^2 + 3$ and put $h = 3x^2 + x + 2 - 3(x^2 + x + 1)$. The terms in
    the support of $h = -2x - 1$ have $O$-indices 0 and 0. Thus we have $\operatorname{ind}_O(h) = 0$.

D3. We write $h = -1\, t_1 - 2\, t_2 + 0\, t_3$. The algorithm returns the following tuple
    $(x^3 - x,\ 0,\ -x^2 + 3,\ -1,\ -2,\ 0)$ and stops.


Therefore we have a representation
$$f = (x^3 - x)\, g_1' + 0\, g_2' + (-x^2 + 3)\, g_3' - 1\, t_1 - 2\, t_2 + 0\, t_3$$
$$\phantom{f} = (x^3 - x)\, g_3 + 0\, g_2 + (-x^2 + 3)\, g_1 - 1\, t_1 - 2\, t_2 + 0\, t_3$$


These calculations show that the order of the polynomials does affect the
outcome of the Border Division Algorithm.


If we fix the tuple $(g_1,\dots,g_\nu)$ then the result of the Border Division
Algorithm is uniquely determined. The given polynomial $f$ is represented
in $P/(g_1,\dots,g_\nu)$ by the residue class of the linear combination $c_1 t_1 + \dots + c_\mu t_\mu$.
We introduce a name for this linear combination.


Definition 4.2.12. Let $O = \{t_1,\dots,t_\mu\}$ be an order ideal, let $G = \{g_1,\dots,g_\nu\}$
be an $O$-border prebasis, and let $\mathcal{G} = (g_1,\dots,g_\nu)$. The normal $O$-remainder
of a polynomial $f$ with respect to $\mathcal{G}$ is
$$\operatorname{NR}_{O,\mathcal{G}}(f) = c_1 t_1 + \dots + c_\mu t_\mu$$
where $f = f_1 g_1 + \dots + f_\nu g_\nu + c_1 t_1 + \dots + c_\mu t_\mu$ is the representation computed
by the Border Division Algorithm.


Example 4.2.13. Let $\mathcal{G} = (g_1, g_2, g_3)$ and $\mathcal{G}' = (g_1', g_2', g_3')$ be the tuples considered
in Example 4.2.11. The above computations lead to
$$\operatorname{NR}_{O,\mathcal{G}}(f) = -3x - 1 \quad\text{and}\quad \operatorname{NR}_{O,\mathcal{G}'}(f) = -2x - 1$$


So the normal O-remainder depends on the ordering of the polynomials in G. 
In the next section we shall encounter a special kind of border prebasis for 
which this unwanted dependence disappears. 


An important consequence of the Border Division Algorithm is that the
residue classes of the elements of $O$ generate $P/(g_1,\dots,g_\nu)$ as a $K$-vector
space. But, as the above examples show, this system of generators is not
necessarily a basis.


Corollary 4.2.14. Let $O = \{t_1,\dots,t_\mu\}$ be an order ideal and $G = \{g_1,\dots,g_\nu\}$
an $O$-border prebasis. Then the residue classes of the elements of $O$ generate
$P/(g_1,\dots,g_\nu)$ as a $K$-vector space. More precisely, the residue class of
every polynomial $f \in P$ can be represented as a linear combination of the
residue classes $\{\bar{t}_1,\dots,\bar{t}_\mu\}$ by computing the normal remainder $\operatorname{NR}_{O,\mathcal{G}}(f)$ for
$\mathcal{G} = (g_1,\dots,g_\nu)$.


4.3 Border bases


After all these preparations we are ready to introduce the fundamental notion
of this article: border bases. They are special systems of generators of zero-dimensional
ideals which do not depend on the choice of a term ordering,
but on the choice of an order ideal. We discuss their existence and uniqueness
and compare them to Gröbner bases of the given ideal. Then we show how
one can use border bases to define normal forms, and we characterize border
bases by the property that the associated multiplication matrices are pairwise
commuting.


4.3.1 Existence and uniqueness of border bases 


As above, let $P = K[x_1,\dots,x_n]$ be a polynomial ring over a field $K$. Moreover,
let $I$ be a zero-dimensional ideal in $P$.


Definition 4.3.1. Let $O = \{t_1,\dots,t_\mu\}$ be an order ideal and $G = \{g_1,\dots,g_\nu\}$
an $O$-border prebasis consisting of polynomials in $I$. We say that the set $G$ is
an $O$-border basis of $I$ if the residue classes of $t_1,\dots,t_\mu$ form a $K$-vector
space basis of $P/I$.


Next we see that this definition implies that an O-border basis of I actually 
generates I. 


Proposition 4.3.2. Let $O = \{t_1,\dots,t_\mu\}$ be an order ideal, and let $G$ be an
$O$-border basis of $I$. Then $I$ is generated by $G$.


Proof. By definition, we have $(g_1,\dots,g_\nu) \subseteq I$. To prove the converse inclusion,
let $f \in I$. Using the Border Division Algorithm 4.2.10, the polynomial $f$ can
be expanded as $f = f_1 g_1 + \dots + f_\nu g_\nu + c_1 t_1 + \dots + c_\mu t_\mu$ where $f_1,\dots,f_\nu \in P$
and $c_1,\dots,c_\mu \in K$. This implies the equality of residue classes
$0 = \bar{f} = c_1 \bar{t}_1 + \dots + c_\mu \bar{t}_\mu$ in $P/I$. By assumption, the residue classes $\bar{t}_1,\dots,\bar{t}_\mu$ form a
$K$-vector space basis. Hence $c_1 = \dots = c_\mu = 0$, and the expansion of $f$ turns
out to be $f = f_1 g_1 + \dots + f_\nu g_\nu$. This completes the proof.


Remark 4.3.3. Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal and $G$ an
$\mathcal{O}$-border prebasis which generates an ideal $I$. We let
$\langle\mathcal{O}\rangle_K = K t_1 + \cdots + K t_\mu$ be the vector subspace of $P$
generated by $\mathcal{O}$. Then Corollary 4.2.14 shows that the residue classes of the
elements of $\mathcal{O}$ generate $P/I$. Since the border basis property requires that these
residue classes are linearly independent, the following conditions are equivalent.


1. The set $G$ is an $\mathcal{O}$-border basis of $I$.
2. We have $I \cap \langle\mathcal{O}\rangle_K = \{0\}$.
3. We have $P = I \oplus \langle\mathcal{O}\rangle_K$.

Having defined a new mathematical object, it is natural to look for its
existence and possibly its uniqueness. In the following theorem, we mention
the field of definition of an ideal. For a discussion on this concept, see [KR00],
Section 2.4. Furthermore, given an ideal $I \subseteq P$ and a term ordering $\sigma$, we
denote the order ideal $\mathbb{T}^n \setminus \mathrm{LT}_\sigma(I)$ by $\mathcal{O}_\sigma(I)$.


Theorem 4.3.4 (Existence and Uniqueness of Border Bases).

Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal, let $I$ be a zero-dimensional ideal
in $P$, and assume that the residue classes of the elements in $\mathcal{O}$ form a $K$-vector
space basis of $P/I$.


1. There exists a unique $\mathcal{O}$-border basis $G$ of $I$.

2. Let $G'$ be an $\mathcal{O}$-border prebasis whose elements are in $I$. Then $G'$ is the
$\mathcal{O}$-border basis of $I$.

3. Let $k$ be the field of definition of $I$. Then we have $G \subseteq k[x_1,\dots,x_n]$.


Proof. First we prove Claim 1. Let $\partial\mathcal{O} = \{b_1,\dots,b_\nu\}$. For every
$i \in \{1,\dots,\nu\}$, the hypothesis implies that the residue class of $b_i$ in $P/I$ is
linearly dependent on the residue classes of the elements of $\mathcal{O}$. Therefore $I$
contains a polynomial of the form $g_i = b_i - \sum_{j=1}^{\mu} \alpha_{ij} t_j$ such that
$\alpha_{ij} \in K$. Then $G = \{g_1,\dots,g_\nu\}$ is an $\mathcal{O}$-border prebasis, and
hence an $\mathcal{O}$-border basis of $I$ by Definition 4.3.1.
Let $G' = \{g'_1,\dots,g'_\nu\}$ be another $\mathcal{O}$-border basis of $I$. If, for
contradiction, there exists a term $b \in \partial\mathcal{O}$ such that the polynomials in $G$
and $G'$ marked by $b$ differ, their difference is a non-zero polynomial in $I$ whose support is
contained in $\mathcal{O}$. This contradicts the hypothesis and Claim 1 is proved.

To prove the second claim, it suffices to observe that, by Definition 4.3.1,
the set $G'$ is an $\mathcal{O}$-border basis of $I$ and to apply the first part. Finally, we
prove Claim 3. Let $k$ be the field of definition of $I$, let $P' = k[x_1,\dots,x_n]$,
and let $I' = I \cap P'$. Given a term ordering $\sigma$, the ideals $I$ and $I'$ have the
same reduced $\sigma$-Gröbner basis (see [KR00], Lemma 2.4.16). Hence we have
$\mathcal{O}_\sigma(I) = \mathcal{O}_\sigma(I')$, and therefore $\dim_k(P'/I') = \dim_K(P/I)$.
The elements of $\mathcal{O}$ are in $P'$ and they are linearly independent modulo $I'$. Hence
their residue classes form a $k$-vector space basis of $P'/I'$. Let $G'$ be the
$\mathcal{O}$-border basis of $I'$. Then $G'$ is an $\mathcal{O}$-border prebasis whose elements
are contained in $I$. Thus the statement follows from Claim 2.


Given an order ideal $\mathcal{O}$ consisting of $\dim_K(P/I)$ many terms, does the
$\mathcal{O}$-border basis of $I$ always exist? The answer is negative, as our next example
shows.


Example 4.3.5. Let $P = \mathbb{Q}[x,y]$, and let $I$ be the vanishing ideal of the set of
five points $\mathbb{X} = \{(0,0), (0,-1), (1,0), (1,1), (-1,1)\}$ in the affine space
$\mathbb{A}^2(\mathbb{Q})$, i.e. let $I = \{f \in P \mid f(p) = 0 \text{ for all } p \in \mathbb{X}\}$.
It is known that $\dim_{\mathbb{Q}}(P/I) = 5$. In $\mathbb{T}^2$, the following order ideals
contain five elements:

$\mathcal{O}_1 = \{1, x, x^2, x^3, x^4\}$, $\mathcal{O}_2 = \{1, x, x^2, x^3, y\}$, $\mathcal{O}_3 = \{1, x, x^2, y, y^2\}$,

$\mathcal{O}_4 = \{1, x, x^2, y, xy\}$, $\mathcal{O}_5 = \{1, x, y, y^2, y^3\}$, $\mathcal{O}_6 = \{1, y, y^2, y^3, y^4\}$,

$\mathcal{O}_7 = \{1, x, y, xy, y^2\}$

Not all of these are suitable for border bases of $I$. For example, the residue
classes of the elements of $\mathcal{O}_1$ cannot form a $K$-vector space basis of $P/I$ since
$x^3 - x \in I$. Similarly, the residue classes of the elements of $\mathcal{O}_6$ cannot form a
$K$-vector space basis of $P/I$ since $y^3 - y \in I$.


So, let us strive for less and ask another question. Does a given
zero-dimensional ideal possess a border basis at all? Using Theorem 4.3.4, we can
rephrase the question in the following way. Given a zero-dimensional ideal $I$,
are there order ideals such that the residue classes of their elements form a
$K$-vector space basis of $P/I$? This time the answer is yes, as we can show
with the help of Gröbner bases.

Given an order ideal $\mathcal{O} \subseteq \mathbb{T}^n$, its complement
$\mathbb{T}^n \setminus \mathcal{O}$ is the set of terms of a monomial ideal. Recall that every
monomial ideal has a unique minimal set of generators (see [KR00], Proposition 1.3.11). The
elements of the minimal set of generators of the monomial ideal corresponding to
$\mathbb{T}^n \setminus \mathcal{O}$ are called the corners of $\mathcal{O}$. A picture
illustrates the significance of this name.


[Figure: an order ideal in the $(x, y)$-plane with its corners marked]


Proposition 4.3.6. Let $\sigma$ be a term ordering on $\mathbb{T}^n$. Then there exists a
unique $\mathcal{O}_\sigma(I)$-border basis $G$ of $I$, and the reduced $\sigma$-Gröbner basis
of $I$ is the subset of $G$ consisting of the polynomials marked by the corners of
$\mathcal{O}_\sigma(I)$.


Proof. By Macaulay's Basis Theorem (see [KR00], Theorem 1.5.7), the residue
classes of the elements in $\mathcal{O}_\sigma(I)$ form a $K$-vector space basis of $P/I$. Thus
Theorem 4.3.4.1 implies the existence and uniqueness of the $\mathcal{O}_\sigma(I)$-border
basis $G$ of $I$.

To prove the second claim, we let $b \in \mathbb{T}^n \setminus \mathcal{O}_\sigma(I)$ be a
corner of $\mathcal{O}_\sigma(I)$. The element of the reduced $\sigma$-Gröbner basis of $I$ with
leading term $b$ has the form $b - \mathrm{NF}_{\sigma,I}(b)$, where $\mathrm{NF}_{\sigma,I}(b)$
is contained in the span of $\mathcal{O}_\sigma(I)$. Since the $\mathcal{O}_\sigma(I)$-border
basis of $I$ is unique, this Gröbner basis element agrees with the border basis element marked
by $b$. Thus the second claim follows and the proof is complete.


To summarize the discussion, the ideal $I$ does not necessarily have an
$\mathcal{O}$-border basis for every order ideal $\mathcal{O}$ consisting of $\dim_K(P/I)$
terms, but there always is an $\mathcal{O}$-border basis if $\mathcal{O}$ is of the form
$\mathcal{O} = \mathcal{O}_\sigma(I)$ for some term ordering $\sigma$. This motivates our next
question. Do all border bases belong to order ideals of the form $\mathcal{O}_\sigma(I)$? In
other words, is there a bijection between the reduced Gröbner bases and the border bases
of $I$? The answer is no, as our next example shows.

Example 4.3.7. Let $P = \mathbb{Q}[x,y]$, and let $\mathbb{X} \subseteq \mathbb{A}^2(\mathbb{Q})$
be the set of points $\mathbb{X} = \{p_1,p_2,p_3,p_4,p_5\}$, where $p_1 = (0,0)$,
$p_2 = (0,-1)$, $p_3 = (1,0)$, $p_4 = (1,1)$, and $p_5 = (-1,1)$. Furthermore, let
$I \subseteq P$ be the vanishing ideal of $\mathbb{X}$ (see Example 4.3.5). The map
$\mathrm{eval} : P/I \to \mathbb{Q}^5$ defined by $f + I \mapsto (f(p_1),\dots,f(p_5))$ is an
isomorphism of $K$-vector spaces.

Consider the order ideal $\mathcal{O} = \{1, x, y, x^2, y^2\}$. The matrix of size $5 \times 5$
whose columns are $(\mathrm{eval}(1), \mathrm{eval}(x),\dots,\mathrm{eval}(y^2))$ is invertible.
Therefore the residue classes of the terms in $\mathcal{O}$ form a $\mathbb{Q}$-vector space
basis of $P/I$, and $I$ has an $\mathcal{O}$-border basis by Theorem 4.3.4.1.

The border of $\mathcal{O}$ is $\partial\mathcal{O} = \{xy, x^3, y^3, xy^2, x^2y\}$. The
$\mathcal{O}$-border basis of $I$ is $G = \{g_1,\dots,g_5\}$ with $g_1 = x^3 - x$,
$g_2 = x^2y - \tfrac12 y - \tfrac12 y^2$, $g_3 = xy - x - \tfrac12 y + x^2 - \tfrac12 y^2$,
$g_4 = xy^2 - x - \tfrac12 y + x^2 - \tfrac12 y^2$, and $g_5 = y^3 - y$. To show that
$\mathcal{O}$ is not of the form $\mathcal{O}_\sigma(I)$, consider the polynomial $g_3$ in more
detail. For any term ordering $\sigma$ we have $x^2 >_\sigma x$ and $y^2 >_\sigma y$. Moreover,
either $x^2 >_\sigma xy >_\sigma y^2$ or $y^2 >_\sigma xy >_\sigma x^2$. This leaves either
$x^2$ or $y^2$ as the leading term of $g_3$. Since these terms are contained in $\mathcal{O}$,
the order ideal $\mathcal{O}$ cannot be the complement of $\mathrm{LT}_\sigma(I)$ in
$\mathbb{T}^2$ for any term ordering $\sigma$.
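
The linear algebra behind this example can be checked mechanically. The following Python
sketch is an illustration and not part of the original text: it builds the evaluation matrix
of the terms of the order ideal at the five points and solves one linear system per border
term to recover the coefficients of the border basis elements. The helper names `evaluate`,
`solve` and `border_coefficients` are hypothetical, and exact rational arithmetic from the
standard `fractions` module is assumed to be adequate for such a small example.

```python
from fractions import Fraction

# Points of X and the order ideal O = {1, x, y, x^2, y^2} as exponent pairs.
POINTS = [(0, 0), (0, -1), (1, 0), (1, 1), (-1, 1)]
ORDER_IDEAL = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2)]
BORDER = [(1, 1), (3, 0), (0, 3), (1, 2), (2, 1)]  # xy, x^3, y^3, xy^2, x^2y


def evaluate(term, point):
    """Value of the term x^a y^b at the point (px, py)."""
    (a, b), (px, py) = term, point
    return Fraction(px) ** a * Fraction(py) ** b


def solve(matrix, rhs):
    """Solve matrix * c = rhs by Gauss-Jordan elimination over Q.

    Raises ValueError if the matrix is singular.
    """
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Row i holds the values of the terms of O at the point p_i, so the columns
# are eval(1), eval(x), eval(y), eval(x^2), eval(y^2).
EVAL = [[evaluate(t, p) for t in ORDER_IDEAL] for p in POINTS]


def border_coefficients(b):
    """Coefficients c_j with b = sum_j c_j t_j on X, so g = b - sum_j c_j t_j."""
    return solve(EVAL, [evaluate(b, p) for p in POINTS])


for b in BORDER:
    print(b, border_coefficients(b))
```

A singular evaluation matrix would make `solve` raise `ValueError`, so a successful run also
confirms that the residue classes of the terms in the order ideal form a basis of $P/I$.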


The upshot of this example is that the set of border bases of a given
zero-dimensional ideal is strictly larger than the set of its reduced Gröbner bases.
Therefore there is a better chance of finding a "nice" system of generators
of $I$ among border bases than among Gröbner bases. For instance, sometimes
border bases are advertised by saying that they keep symmetry. While this is 
true in many cases, the claim has to be taken with a grain of salt. Just have 
a look at the following example. 


Example 4.3.8. Let $P = \mathbb{Q}[x,y]$ and $I = (x^2 + y^2 - 1,\; xy - 1)$. The ideal $I$ is
symmetric with respect to the indeterminates $x$ and $y$. Moreover, we have
$\dim_K(P/I) = 4$. The only symmetric order ideal consisting of four terms is
$\mathcal{O} = \{1, x, y, xy\}$. But $I$ does not have an $\mathcal{O}$-border basis, since we
have $xy - 1 \in I$. It may be interesting to observe that the residue classes of the
elements $1$, $x - y$, $x + y$, $x^2 - y^2$ form a $K$-vector space basis of $P/I$.


Let us investigate the relationship between Gröbner bases and border bases
a little further. A list (or a set) of marked polynomials $((g_1, b_1),\dots,(g_\nu, b_\nu))$
is said to be marked coherently if there exists a term ordering $\sigma$ such that
$\mathrm{LT}_\sigma(g_i) = b_i$ for $i = 1,\dots,\nu$. Furthermore, recall that an
$\mathcal{O}$-border (pre)basis can be viewed as a tuple of polynomials marked by terms in the
border of $\mathcal{O}$.


Proposition 4.3.9. Let $\mathcal{O}$ be an order ideal such that the residue classes of
the elements of $\mathcal{O}$ form a $K$-vector space basis of $P/I$. Let $G$ be the
$\mathcal{O}$-border basis of $I$, and let $G'$ be the subset of $G$ consisting of the elements
marked by the corners of $\mathcal{O}$. Then the following conditions are equivalent.


1. There exists a term ordering $\sigma$ such that $\mathcal{O} = \mathcal{O}_\sigma(I)$.
2. The elements in $G'$ are marked coherently.


190 A. Kehrein, M. Kreuzer, and L. Robbiano 


3. The elements in $G$ are marked coherently.


Moreover, if these conditions are satisfied, then $G'$ is the reduced $\sigma$-Gröbner
basis of $I$.


Proof. Let us prove that 1) implies both 2) and the additional claim. The fact
that $G'$ is the reduced $\sigma$-Gröbner basis of $I$ follows from Proposition 4.3.6.
Hence $G'$ is marked coherently. Now we show that 2) implies 3). For every
polynomial $g \in G \setminus G'$, there exists a polynomial $g' \in G'$ such that the marked
term of $g$ is of the form $b = t\,\mathrm{LT}_\sigma(g')$. Then the support of the polynomial
$g - t g'$ is contained in $\mathcal{O}$, and therefore $g = t g'$. This proves that also $g$ is
marked coherently with respect to $\sigma$.

Since 3) $\Rightarrow$ 2) is obvious, only 2) $\Rightarrow$ 1) remains to be shown. Let
$\sigma$ be a term ordering which marks $G'$ coherently. Denote the monomial ideal generated by
the leading terms of the elements in $G'$ by $\mathrm{LT}_\sigma(G')$. Since
$\mathrm{LT}_\sigma(I) \supseteq \mathrm{LT}_\sigma(G')$, we get
$\mathcal{O}_\sigma(I) = \mathbb{T}^n \setminus \mathrm{LT}_\sigma(I) \subseteq \mathbb{T}^n \setminus \mathrm{LT}_\sigma(G') = \mathcal{O}$.
Also the residue classes of the elements of $\mathcal{O}_\sigma(I)$ form a $K$-vector space
basis of $P/I$, and hence the inclusion is indeed an equality.


The proposition applies for instance to the monomial ideal J generated 
by the corners of O. Later we shall see that the equivalent conditions of this 
proposition apply for a particular type of zero-dimensional ideals, namely the 
vanishing ideals of distracted fractions (see Example 4.4.5). The following 
remark will be useful in the last section. 


Remark 4.3.10. Assume that there exists a term ordering $\sigma$ such that every
corner of $\mathcal{O}$ is $\sigma$-greater than every element in $\mathcal{O}$. Then we have
$\mathcal{O} = \mathcal{O}_\sigma(I)$ for all ideals $I$ such that the residue classes of the
terms in $\mathcal{O}$ form a $K$-vector space basis of $P/I$. We do not know whether the
converse holds, but we believe it does.


4.3.2 Normal forms 


In Gröbner basis theory one can define a unique representative of a residue
class in $P/I$ by using the normal form of a polynomial $f$. The normal form
is obtained by computing the normal remainder of $f$ under the division by
a Gröbner basis. It does not depend on the Gröbner basis, but only on the
given term ordering and the ideal $I$. Hence it can be used to make the ring
operations in $P/I$ effectively computable. In this subsection we imitate this
approach and generalize the normal form to border basis theory.

Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal, let $G = \{g_1,\dots,g_\nu\}$ be the
$\mathcal{O}$-border basis of a zero-dimensional ideal $I$, and let $\mathcal{G}$ be the tuple
$(g_1,\dots,g_\nu)$. In this situation the normal $\mathcal{O}$-remainder of a polynomial does
not depend on the order of the elements in $\mathcal{G}$.


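Proposition 4.3.11. Let $\pi : \{1,\dots,\nu\} \to \{1,\dots,\nu\}$ be a permutation, and
let $\mathcal{G}' = (g_{\pi(1)},\dots,g_{\pi(\nu)})$ be the corresponding permutation of the
tuple $\mathcal{G}$. Then we have
$\mathrm{NR}_{\mathcal{O},\mathcal{G}}(f) = \mathrm{NR}_{\mathcal{O},\mathcal{G}'}(f)$ for every
polynomial $f \in P$.

Proof. The Border Division Algorithm applied to $\mathcal{G}$ and $\mathcal{G}'$, respectively,
yields representations

$$f = f_1 g_1 + \cdots + f_\nu g_\nu + \mathrm{NR}_{\mathcal{O},\mathcal{G}}(f) = f'_1 g_{\pi(1)} + \cdots + f'_\nu g_{\pi(\nu)} + \mathrm{NR}_{\mathcal{O},\mathcal{G}'}(f)$$

where $f_i, f'_i \in P$. Therefore we have
$\mathrm{NR}_{\mathcal{O},\mathcal{G}}(f) - \mathrm{NR}_{\mathcal{O},\mathcal{G}'}(f) \in \langle\mathcal{O}\rangle_K \cap I$.
The hypothesis that $I$ has an $\mathcal{O}$-border basis implies
$\langle\mathcal{O}\rangle_K \cap I = \{0\}$. Hence the claim follows.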


This result allows us to introduce the following definition. 


Definition 4.3.12. Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal and
$G = \{g_1,\dots,g_\nu\}$ an $\mathcal{O}$-border basis of $I$. The normal form of a polynomial
$f \in P$ with respect to $\mathcal{O}$ is the polynomial
$\mathrm{NF}_{\mathcal{O},I}(f) = \mathrm{NR}_{\mathcal{O},\mathcal{G}}(f)$.


The normal form $\mathrm{NF}_{\mathcal{O},I}(f)$ of $f \in P$ can be calculated by dividing $f$
by the $\mathcal{O}$-border basis of $I$. It is zero if and only if $f \in I$. Further basic
properties of normal forms are collected in the following proposition.


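Proposition 4.3.13 (Basic Properties of Normal Forms).
Let $\mathcal{O}$ be an order ideal, and suppose that $I$ has an $\mathcal{O}$-border basis.


1. If there exists a term ordering $\sigma$ such that $\mathcal{O} = \mathcal{O}_\sigma(I)$,
then we have $\mathrm{NF}_{\mathcal{O},I}(f) = \mathrm{NF}_{\sigma,I}(f)$ for all $f \in P$.

2. For $f_1, f_2 \in P$, we have
$\mathrm{NF}_{\mathcal{O},I}(f_1 - f_2) = \mathrm{NF}_{\mathcal{O},I}(f_1) - \mathrm{NF}_{\mathcal{O},I}(f_2)$.

3. For $f \in P$, we have
$\mathrm{NF}_{\mathcal{O},I}(\mathrm{NF}_{\mathcal{O},I}(f)) = \mathrm{NF}_{\mathcal{O},I}(f)$.

4. For $f_1, f_2 \in P$, we have
$\mathrm{NF}_{\mathcal{O},I}(f_1 f_2) = \mathrm{NF}_{\mathcal{O},I}(\mathrm{NF}_{\mathcal{O},I}(f_1)\,\mathrm{NF}_{\mathcal{O},I}(f_2))$.

5. Let $M_1,\dots,M_n \in \mathrm{Mat}_\mu(K)$ be the matrices of the multiplication
endomorphisms of $P/I$ with respect to the basis given by the residue classes of
the terms in $\mathcal{O}$. Suppose that $t_1 = 1$, and let $e_1$ be the first standard basis
vector of $K^\mu$. Then we have

$$\mathrm{NF}_{\mathcal{O},I}(f) = (t_1,\dots,t_\mu) \cdot f(M_1,\dots,M_n) \cdot e_1$$

for every $f \in P$.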


Proof. Claim 1) follows because both $\mathrm{NF}_{\mathcal{O},I}(f)$ and
$\mathrm{NF}_{\sigma,I}(f)$ are equal to the uniquely determined polynomial in $f + I$ whose
support is contained in $\mathcal{O}$. Claims 2), 3), and 4) follow from the same uniqueness.
To prove the last claim, we observe that $e_1$ is the coordinate tuple of $1 + I$ in the basis
of $P/I$ given by the residue classes of the terms in $\mathcal{O}$. Since $M_i$ is the matrix
of the multiplication by $x_i$, the tuple $f(M_1,\dots,M_n) \cdot e_1$ is the coordinate tuple
of $f + I$ in this basis. From this the claim follows immediately.
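
The formula in Claim 5 lends itself to a small computation. The sketch below is only an
illustration under stated assumptions: it hard-codes the multiplication matrices by $x$ and
$y$ for the vanishing ideal of the five points of Example 4.3.7 with respect to the basis
$(1, x, y, x^2, y^2)$, and the names `MX`, `MY`, `normal_form` and `evaluate` are hypothetical.
A polynomial is represented as a dictionary from exponent pairs to coefficients.

```python
from fractions import Fraction

H = Fraction(1, 2)
# Multiplication matrices of x and y on P/I for the ideal I of Example 4.3.7,
# with respect to the basis O = (1, x, y, x^2, y^2); column j holds the
# coordinates of x*t_j (resp. y*t_j) modulo I.
MX = [[0, 0, 0, 0, 0],
      [1, 0, 1, 1, 1],
      [0, 0, H, 0, H],
      [0, 1, -1, 0, -1],
      [0, 0, H, 0, H]]
MY = [[0, 0, 0, 0, 0],
      [0, 1, 0, 0, 0],
      [1, H, 0, H, 1],
      [0, -1, 0, 0, 0],
      [0, H, 1, H, 0]]
POINTS = [(0, 0), (0, -1), (1, 0), (1, 1), (-1, 1)]
ORDER_IDEAL = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2)]


def apply(matrix, vector):
    """Matrix-vector product over Q."""
    return [sum(Fraction(a) * v for a, v in zip(row, vector)) for row in matrix]


def normal_form(poly):
    """Coordinates of NF(f) in the basis O, computed as f(MX, MY) * e_1.

    poly maps exponent pairs (a, b) to the coefficient of x^a y^b.
    """
    result = [Fraction(0)] * 5
    for (a, b), coeff in poly.items():
        vector = [Fraction(1), 0, 0, 0, 0]  # e_1, the coordinates of 1 + I
        for _ in range(b):
            vector = apply(MY, vector)
        for _ in range(a):
            vector = apply(MX, vector)
        result = [r + Fraction(coeff) * v for r, v in zip(result, vector)]
    return result


def evaluate(poly, point):
    """Value of the polynomial at a point."""
    px, py = point
    return sum(Fraction(c) * Fraction(px) ** a * Fraction(py) ** b
               for (a, b), c in poly.items())


f = {(2, 2): 1, (1, 1): 3, (0, 0): -2}      # f = x^2 y^2 + 3xy - 2
nf_poly = dict(zip(ORDER_IDEAL, normal_form(f)))  # NF(f) as a polynomial
print(nf_poly)
```

Since the normal form lies in the same residue class as $f$, the two polynomials take the
same values at all five points, which gives a simple sanity check for the matrices.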


4.3.3 Border bases and commuting matrices


The purpose of this subsection is to provide the link between border bases
and the theory of commuting endomorphisms discussed in the second section.
More precisely, we shall characterize border bases by the property that their
corresponding formal multiplication matrices commute.

Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal with border
$\partial\mathcal{O} = \{b_1,\dots,b_\nu\}$, and let $G = \{g_1,\dots,g_\nu\}$ be an
$\mathcal{O}$-border prebasis. For $j = 1,\dots,\nu$, we write
$g_j = b_j - \sum_{i=1}^{\mu} \alpha_{ij} t_i$ with $\alpha_{1j},\dots,\alpha_{\mu j} \in K$.

In Section 4.1 we saw that a $K$-vector space basis of $P/I$ allows us to
describe the multiplicative structure of this algebra via a tuple of commuting
matrices. If $G$ is a border basis, we can describe these matrices as follows.


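Remark 4.3.14. In the above setting, assume that $G$ is a border basis. Then
$\{\bar t_1,\dots,\bar t_\mu\}$ is a $K$-vector space basis of $P/I$, and each multiplication
endomorphism $X_k$ of $P/I$ corresponds to a matrix $\mathcal{X}_k = (\xi^{(k)}_{ij})$, i.e.

$$\begin{aligned}
X_k(\bar t_1) &= \xi^{(k)}_{11}\,\bar t_1 + \cdots + \xi^{(k)}_{\mu 1}\,\bar t_\mu \\
&\;\;\vdots \\
X_k(\bar t_\mu) &= \xi^{(k)}_{1\mu}\,\bar t_1 + \cdots + \xi^{(k)}_{\mu\mu}\,\bar t_\mu
\end{aligned}$$

In these expansions only two cases occur. The product $x_k t_j$ either equals some
term in the order ideal $t_r \in \mathcal{O}$ or some border term
$b_s \in \partial\mathcal{O}$. In the former case we have

$$X_k(\bar t_j) = 0\,\bar t_1 + \cdots + 0\,\bar t_{r-1} + 1\,\bar t_r + 0\,\bar t_{r+1} + \cdots + 0\,\bar t_\mu,$$

i.e., the $j$-th column of $\mathcal{X}_k$ is the $r$-th standard basis vector $e_r$. In the
latter case we have $x_k t_j + I = b_s + I = \alpha_{1s} t_1 + \cdots + \alpha_{\mu s} t_\mu + I$,
where the coefficients $\alpha_{is}$ are given by $g_s = b_s - \sum_i \alpha_{is} t_i$.
Therefore we have

$$X_k(\bar t_j) = \alpha_{1s}\,\bar t_1 + \cdots + \alpha_{\mu s}\,\bar t_\mu,$$

i.e., the $j$-th column of $\mathcal{X}_k$ is $(\alpha_{1s},\dots,\alpha_{\mu s})^{\mathrm{tr}}$.
Observe that all matrix components $\xi^{(k)}_{ij}$ are determined by the polynomials
$g_1,\dots,g_\nu$.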


In view of this remark, at least formally, multiplication matrices can be 
defined for any border prebasis. 


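Definition 4.3.15. Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal and
$G = \{g_1,\dots,g_\nu\}$ an $\mathcal{O}$-border prebasis. For $1 \le k \le n$, define the
$k$-th formal multiplication matrix $\mathcal{X}_k$ column-wise by

$$(\mathcal{X}_k)_{*j} = \begin{cases} e_r, & \text{if } x_k t_j = t_r, \\ (\alpha_{1s},\dots,\alpha_{\mu s})^{\mathrm{tr}}, & \text{if } x_k t_j = b_s, \end{cases}$$

where $(\mathcal{X}_k)_{*j}$ denotes the $j$-th column of $\mathcal{X}_k$.


To get some insight into the meaning of this definition, let us have a look
at Example 4.3.7 "from the outside."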


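Example 4.3.16. Let $P = \mathbb{Q}[x,y]$, and let $\mathcal{O} = \{t_1, t_2, t_3, t_4, t_5\}$
be the order ideal given by $t_1 = 1$, $t_2 = x$, $t_3 = y$, $t_4 = x^2$, and $t_5 = y^2$. The
border of $\mathcal{O}$ is $\partial\mathcal{O} = \{b_1, b_2, b_3, b_4, b_5\}$ where $b_1 = xy$,
$b_2 = x^3$, $b_3 = y^3$, $b_4 = x^2y$, and $b_5 = xy^2$. The polynomials
$g_1 = xy - x - \tfrac12 y + x^2 - \tfrac12 y^2$, $g_2 = x^3 - x$, $g_3 = y^3 - y$,
$g_4 = x^2y - \tfrac12 y - \tfrac12 y^2$, and $g_5 = xy^2 - x - \tfrac12 y + x^2 - \tfrac12 y^2$
define a border prebasis of $I = (g_1,\dots,g_5)$. Now we compute the formal multiplication
matrices $\mathcal{X}$ and $\mathcal{Y}$.

On the one hand, we have $x\,t_1 = t_2$, $x\,t_2 = t_4$, $x\,t_3 = b_1$, $x\,t_4 = b_2$, and
$x\,t_5 = b_5$. On the other hand, we have $y\,t_1 = t_3$, $y\,t_2 = b_1$, $y\,t_3 = t_5$,
$y\,t_4 = b_4$, and $y\,t_5 = b_3$. Thus we obtain

$$\mathcal{X} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 1 & -1 & 0 & -1 \\ 0 & 0 & 1/2 & 0 & 1/2 \end{pmatrix}
\quad\text{and}\quad
\mathcal{Y} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 1/2 & 0 & 1/2 & 1 \\ 0 & -1 & 0 & 0 & 0 \\ 0 & 1/2 & 1 & 1/2 & 0 \end{pmatrix}$$

By Example 4.3.7, this border prebasis is even a border basis of $I$. Hence the
formal multiplication matrices are the actual multiplication matrices. As such
they commute.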
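
The column-wise recipe of Definition 4.3.15 translates directly into a few lines of code. The
following Python sketch is an illustration rather than the authors' implementation: terms are
encoded as exponent tuples, a prebasis is assumed to be given as a dictionary from border
terms to coefficient lists, and the names `formal_multiplication_matrices` and `matmul` are
hypothetical. It is applied to the data of this example.

```python
from fractions import Fraction


def formal_multiplication_matrices(order_ideal, prebasis, n):
    """Build the n formal multiplication matrices of an O-border prebasis.

    order_ideal: list of exponent tuples t_1, ..., t_mu.
    prebasis:    dict mapping each border term b (an exponent tuple) to the
                 coefficient list (alpha_1, ..., alpha_mu) with
                 g = b - sum_i alpha_i * t_i.
    """
    mu = len(order_ideal)
    index = {t: r for r, t in enumerate(order_ideal)}
    matrices = []
    for k in range(n):
        columns = []
        for t in order_ideal:
            product = tuple(e + (1 if i == k else 0) for i, e in enumerate(t))
            if product in index:  # x_k * t_j = t_r: the column is e_r
                column = [Fraction(int(r == index[product])) for r in range(mu)]
            else:  # x_k * t_j = b_s: the column holds the coefficients of g_s
                column = [Fraction(a) for a in prebasis[product]]
            columns.append(column)
        # Transpose the list of columns into a list of rows.
        matrices.append([[columns[j][i] for j in range(mu)] for i in range(mu)])
    return matrices


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


half = Fraction(1, 2)
O = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2)]       # 1, x, y, x^2, y^2
G = {
    (1, 1): [0, 1, half, -1, half],                # xy    = x + y/2 - x^2 + y^2/2
    (3, 0): [0, 1, 0, 0, 0],                       # x^3   = x
    (0, 3): [0, 0, 1, 0, 0],                       # y^3   = y
    (2, 1): [0, 0, half, 0, half],                 # x^2 y = y/2 + y^2/2
    (1, 2): [0, 1, half, -1, half],                # x y^2 = x + y/2 - x^2 + y^2/2
}
X, Y = formal_multiplication_matrices(O, G, 2)
print(matmul(X, Y) == matmul(Y, X))
```

Representing terms as exponent tuples keeps the lookup "is $x_k t_j$ in the order ideal or in
the border?" a plain dictionary access, which is all the definition requires.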


The following theorem is the main result of this subsection. We charac- 
terize border bases by the property that their formal multiplication matrices 
commute. A more general theorem is contained in Chapter 3. 


Theorem 4.3.17 (Border Bases and Commuting Matrices). 

Let $\mathcal{O} = \{t_1,\dots,t_\mu\}$ be an order ideal. An $\mathcal{O}$-border prebasis
$\{g_1,\dots,g_\nu\}$ is an $\mathcal{O}$-border basis of $I = (g_1,\dots,g_\nu)$ if and only
if its formal multiplication matrices are pairwise commuting. In that case the formal
multiplication matrices represent the multiplication endomorphisms of $P/I$ with respect to the
basis $\{\bar t_1,\dots,\bar t_\mu\}$.


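Proof. Let $\mathcal{X}_1,\dots,\mathcal{X}_n$ be the formal multiplication matrices
corresponding to the given $\mathcal{O}$-border prebasis $G = \{g_1,\dots,g_\nu\}$. If $G$ is an
$\mathcal{O}$-border basis, then Remark 4.3.14 shows that $\mathcal{X}_1,\dots,\mathcal{X}_n$
represent the multiplication endomorphisms of $P/I$. Hence they are pairwise commuting.

It remains to show sufficiency. Without loss of generality, let $t_1 = 1$. The
matrices $\mathcal{X}_1,\dots,\mathcal{X}_n$ define a $P$-module structure on
$\langle\mathcal{O}\rangle_K$ via

$$f \cdot (c_1 t_1 + \cdots + c_\mu t_\mu) = (t_1,\dots,t_\mu)\, f(\mathcal{X}_1,\dots,\mathcal{X}_n)\, (c_1,\dots,c_\mu)^{\mathrm{tr}}.$$

First we show that this $P$-module is cyclic with generator $t_1$. To do so, we
use induction on the degree to show $t_i \cdot t_1 = t_i$ for $i = 1,\dots,\mu$. The induction
starts with $t_1 = (t_1,\dots,t_\mu)\, \mathcal{I}_\mu \cdot e_1$, where $\mathcal{I}_\mu$ is
the identity matrix. For the induction step, let $t_i = x_j t_k$. Then we have

$$\begin{aligned}
t_i \cdot t_1 &= (t_1,\dots,t_\mu)\, t_i(\mathcal{X}_1,\dots,\mathcal{X}_n)\, e_1 = (t_1,\dots,t_\mu)\, \mathcal{X}_j\, t_k(\mathcal{X}_1,\dots,\mathcal{X}_n)\, e_1 \\
&= (t_1,\dots,t_\mu)\, \mathcal{X}_j\, e_k = (t_1,\dots,t_\mu)\, e_i = t_i.
\end{aligned}$$

Thus we obtain a surjective $P$-linear map $\tilde\Theta : P \to \langle\mathcal{O}\rangle_K$
such that $f \mapsto f \cdot t_1$ and an induced isomorphism of $P$-modules
$\Theta : P/J \to \langle\mathcal{O}\rangle_K$ with $J = \ker\tilde\Theta$.
In particular, the residue classes $t_1 + J,\dots,t_\mu + J$ are $K$-linearly independent.

Next we show $I \subseteq J$. Let $b_j = x_k t_\ell$. Then we have

$$\begin{aligned}
g_j(\mathcal{X}_1,\dots,\mathcal{X}_n)\, e_1 &= b_j(\mathcal{X}_1,\dots,\mathcal{X}_n)\, e_1 - \sum_{i=1}^{\mu} \alpha_{ij}\, t_i(\mathcal{X}_1,\dots,\mathcal{X}_n)\, e_1 \\
&= \mathcal{X}_k\, t_\ell(\mathcal{X}_1,\dots,\mathcal{X}_n)\, e_1 - \sum_{i=1}^{\mu} \alpha_{ij} e_i = \mathcal{X}_k\, e_\ell - \sum_{i=1}^{\mu} \alpha_{ij} e_i \\
&= \sum_{i=1}^{\mu} \alpha_{ij} e_i - \sum_{i=1}^{\mu} \alpha_{ij} e_i = 0.
\end{aligned}$$

Therefore we have $g_j \in \ker\tilde\Theta$ for $j = 1,\dots,\nu$ and $I \subseteq J$, as desired.

Hence there is a natural surjective ring homomorphism $\Psi : P/I \to P/J$.
Since the set $\{t_1 + I,\dots,t_\mu + I\}$ generates the $K$-vector space $P/I$, and since
the set $\{t_1 + J,\dots,t_\mu + J\}$ is $K$-linearly independent, both sets must be bases
and $I = J$. This shows that $G$ is an $\mathcal{O}$-border basis of $I$.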


The following example shows that the formal multiplication matrices cor- 
responding to an O-border prebasis are not always commuting. 


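Example 4.3.18. Let $P = \mathbb{Q}[x,y]$ and $\mathcal{O} = \{t_1, t_2, t_3, t_4, t_5\}$ with
$t_1 = 1$, $t_2 = x$, $t_3 = y$, $t_4 = x^2$, and $t_5 = y^2$. Then the border of $\mathcal{O}$
is $\partial\mathcal{O} = \{b_1, b_2, b_3, b_4, b_5\}$ with $b_1 = xy$, $b_2 = x^3$, $b_3 = y^3$,
$b_4 = x^2y$, and $b_5 = xy^2$. Consider the set of polynomials $G = \{g_1, g_2, g_3, g_4, g_5\}$
with $g_1 = xy - x^2 - y^2$, $g_2 = x^3 - x^2$, $g_3 = y^3 - y^2$, $g_4 = x^2y - x^2$, and
$g_5 = xy^2 - y^2$. It is an $\mathcal{O}$-border prebasis of the ideal $I = (g_1,\dots,g_5)$.
Its formal multiplication matrices

$$\mathcal{X} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
\mathcal{Y} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \end{pmatrix}$$

do not commute:

$$\mathcal{X} \cdot \mathcal{Y} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \end{pmatrix}
\ne
\mathcal{Y} \cdot \mathcal{X} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \end{pmatrix}$$

By the theorem, the set $G$ is not an $\mathcal{O}$-border basis of $I$.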
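
The failure of commutativity is easy to confirm by machine. The short Python sketch below is
an illustration only: it hard-codes the two formal multiplication matrices of this example,
multiplies them in both orders, and lists the positions where the products differ. The names
`matmul` and `differences` are hypothetical.

```python
# Formal multiplication matrices of the prebasis in Example 4.3.18,
# written as lists of rows with respect to O = (1, x, y, x^2, y^2).
X = [[0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 1, 0, 1]]
Y = [[0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 1, 0, 1, 0],
     [0, 1, 1, 0, 1]]


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


XY = matmul(X, Y)
YX = matmul(Y, X)
# 1-based positions (row, column) where the two products disagree.
differences = [(i + 1, j + 1)
               for i in range(5) for j in range(5) if XY[i][j] != YX[i][j]]
print(differences)
```

By Theorem 4.3.17 a single nonzero entry of the commutator already rules out the border basis
property, so the check can stop at the first difference in larger examples.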

The condition that the formal multiplication matrices of a border basis
have to commute can also be interpreted in terms of the syzygies of that
basis (see [Ste04]). Based on the results of this section one can now imitate
the development of Gröbner basis theory for border bases. For instance, the
border basis analogues of the conditions A to D which characterize Gröbner
bases in [KR00], Chapter 2, are examined by the first two authors in [KK03a].


4.4 Application to statistics 


Fifty percent of the citizens of this country 
have a below average understanding of statistics. 
(Anonymous) 


In this last section we see how to solve a problem in computational commu- 
tative algebra whose motivation comes from statistics. Does this sound strange 
to you? Well, come and see. Our problem comes up in the branch of statistics 
called design of experiments. If you want to get a more detailed understanding 
of this theory, we suggest that you start exploring it by reading [Rob98]. Or, 
if you prefer the statisticians' point of view, you can consult [PRW00].

To get to the heart of the problem, let us introduce some fundamental
concepts of design of experiments. A full factorial design is a finite set of
points in affine space $\mathbb{A}^n(K) \cong K^n$ of the form $D = D_1 \times \cdots \times D_n$,
where $D_i$ is a finite subset of $K$. Associated to it we may consider the vanishing ideal
$I_D = \{f \in P \mid f(p) = 0 \text{ for all } p \in D\}$. It is a complete intersection
$I_D = (f_1,\dots,f_n)$ such that $f_i \in K[x_i]$ is a product of linear forms for
$i = 1,\dots,n$. For instance, in $\mathbb{A}^2(\mathbb{Q})$ we have the full factorial design


[Figure: the twelve points of $D = \{0,1,2,3\} \times \{0,1,2\}$ in the $(x, y)$-plane]


whose vanishing ideal in $\mathbb{Q}[x,y]$ is $I_D = (x(x-1)(x-2)(x-3),\; y(y-1)(y-2))$.
The particular shape of the generators of $I_D$ implies that they are the reduced
$\sigma$-Gröbner basis of $I_D$ with respect to any term ordering $\sigma$. Hence the order
ideal $\mathcal{O}_D = \mathbb{T}^n \setminus \mathrm{LT}_\sigma(I_D)$ is canonically associated
to $D$. In the example at hand we have for instance


$$\mathcal{O}_D = \{1,\; x,\; y,\; x^2,\; xy,\; y^2,\; x^3,\; x^2y,\; xy^2,\; x^3y,\; x^2y^2,\; x^3y^2\}$$


If a particular problem depends on $n$ parameters and each parameter can assume finitely many
values $D_i \subseteq K$, the full factorial design $D = D_1 \times \cdots \times D_n$
corresponds to the set of all possible experiments. The main task in the design of experiments
is to identify an unknown function $f : D \to K$. This function is a mathematical model of a
quantity which has to be computed or optimized. Since it is defined on a finite set, it can
be determined by performing all experiments in $D$ and measuring the value
of $f$ each time. Notice that a function $f$ defined on a finite set is necessarily
a polynomial function.

However, in most cases it is impossible to perform all experiments corre- 
sponding to the full factorial design. The obstacles can be, for instance, lack 
of time, lack of money, or lack of patience. Only a subset of those experiments 
can be performed. The question is how many and which? In statistical jargon 
a subset $F$ of a full factorial design $D$ is called a fraction. Our task is to
choose a fraction $F \subseteq D$ that allows us to identify the model. In particular,
we need to describe the order ideals whose residue classes form a $K$-basis
of $P/I_F$. Statisticians express this property by saying that such order ideals
(or complete sets of estimable terms, as they call them) are identified by $F$.

Even more important is the so-called inverse problem. Suppose we are
given an order ideal $\mathcal{O}$. We would like to determine all fractions $F \subseteq D$
such that the residue classes of the elements of $\mathcal{O}$ form a $K$-basis of $P/I_F$.
The main result of [CR97] was a partial solution of this inverse problem. More
precisely, all fractions $F \subseteq D$ were found such that
$\mathcal{O} = \mathcal{O}_\sigma(I_F)$ for some term ordering $\sigma$. However, we have already
pointed out that some order ideals $\mathcal{O}$ do not fit into this scheme (see
Example 4.3.7). Later, in the paper [CR01] the full solution was presented, and the main idea
was to use border bases.

Before delving into the general solution of the inverse problem following
the technique employed in [CR01], let us briefly explain an example of an
actual statistical problem. This example is taken from [BHH78] and adapted
to our setting and terminology.


Example 4.4.1. A number of similar chemical plants had been successfully 
operating for several years in different locations. In a newly constructed plant 
the filtration cycle took almost twice as long as in the older plants. Seven 
possible causes of the difficulty were considered by the experts. 


1. The water for the new plant was somehow different in mineral content.
2. The raw material was not identical in all respects to that used in the older
plants.
3. The temperature of filtration in the new plant was slightly lower than in
the older plants.
4. A new recycle device was absent in the older plants.
5. The rate of addition of caustic soda was higher in the new plant.
6. A new type of filter cloth was being used in the new plant.
7. The holdup time was lower than in the older plants.


These causes lead to seven variables $x_1,\dots,x_7$. Each of them can assume
only two values, namely old and new which we denote by 0 and 1, respectively.
Our full factorial design $D \subseteq \mathbb{A}^7(\mathbb{Q})$ is therefore the set
$D = \{0,1\}^7$. Its vanishing ideal is $I_D = (x_1^2 - x_1,\; x_2^2 - x_2,\; \dots,\; x_7^2 - x_7)$
in the polynomial ring $P = \mathbb{Q}[x_1, x_2, \dots, x_7]$.

The model $f : D \to \mathbb{Q}$ is the length of a filtration cycle. In order to
identify it, we would have to perform 128 cycles. This is impracticable, since
it would require too much time and money. On the other hand, suppose for a
moment that we conduct all experiments and the output is $f = a + b\,x_1 + c\,x_2$
for some $a, b, c \in \mathbb{Q}$. At this point it becomes clear that we wasted many
resources. Had we known in advance that the polynomial has only three unknown
coefficients, we could have identified them by performing only three
suitable experiments! Namely, if we determine three values of the polynomial
$a + b\,x_1 + c\,x_2$, we can find $a, b, c$ by solving a system of three linear equations
in these three indeterminates. If the matrix of coefficients is invertible, this is
an easy task.

However, a priori one does not know that the answer has the shape indicated
above. In practice, one has to make some guesses, perform well-chosen
experiments, and possibly modify the guesses until the process yields the
desired answer. In the case of the chemical plant, it turned out that only $x_1$
and $x_5$ were relevant for identifying the model.
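
The "three suitable experiments" can be made concrete with a toy computation. The following
Python sketch is purely illustrative: the three settings of $(x_1, x_2)$ and the measured
cycle times are invented numbers, not data from [BHH78], and the helper `solve` is a
hypothetical name for a small exact Gauss-Jordan solver.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * c = rhs over Q; raises ValueError if singular."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix: the fraction does not identify the model")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Hypothetical fraction: three settings of (x1, x2), all other factors fixed.
experiments = [(0, 0), (1, 0), (0, 1)]
# Hypothetical measured cycle times (illustrative numbers only).
measurements = [40, 65, 42]

# One row (1, x1, x2) per experiment, matching the model a + b*x1 + c*x2.
design_matrix = [[1, x1, x2] for x1, x2 in experiments]
a, b, c = solve(design_matrix, measurements)
print(a, b, c)
```

The rows of `design_matrix` are the values of the terms $1, x_1, x_2$ at the chosen points,
so its invertibility is exactly the condition discussed in the text.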


In this example there is one point which needs additional explanation. How
can we choose the fraction $F$ such that the matrix of coefficients is invertible?
In other words, given a full factorial design $D$ and an order ideal
$\mathcal{O} \subseteq \mathcal{O}_D$, which fractions $F \subseteq D$ have the property that
the residue classes of the elements of $\mathcal{O}$ are a $K$-basis of $P/I_F$? This is
precisely the inverse problem stated above. In order to explain its solution, we introduce the
following terminology.


Definition 4.4.2. For $i = 1,\dots,n$, let $\ell_i \ge 1$ and
$D_i = \{a_{i1}, a_{i2}, \dots, a_{i\ell_i}\} \subseteq K$. Then we say that the full factorial
design $D = D_1 \times \cdots \times D_n \subseteq \mathbb{A}^n(K)$ has levels
$(\ell_1,\dots,\ell_n)$.


The polynomials $f_i = (x_i - a_{i1}) \cdots (x_i - a_{i\ell_i})$ with $i = 1,\dots,n$ generate
the vanishing ideal $I_D$ of $D$. They are called the canonical polynomials of $D$.
Since $\{f_1,\dots,f_n\}$ is a universal Gröbner basis of $I_D$ (i.e. a Gröbner basis
with respect to every term ordering), the order ideal

$$\mathcal{O}_D = \{x_1^{\alpha_1} \cdots x_n^{\alpha_n} \mid 0 \le \alpha_i < \ell_i \text{ for } i = 1,\dots,n\}$$

represents a $K$-basis of $P/I_D$. We call it the complete set of estimable
terms of $D$.
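
For given levels the complete set of estimable terms is just a box of exponent vectors. The
Python sketch below is an illustration with hypothetical function names: it enumerates this
box and checks the defining property of an order ideal, namely closure under division by
indeterminates (checking one-step divisors suffices by induction on the degree).

```python
from itertools import product


def complete_estimable_terms(levels):
    """All exponent tuples (alpha_1, ..., alpha_n) with 0 <= alpha_i < l_i."""
    return list(product(*(range(level) for level in levels)))


def is_order_ideal(terms):
    """True if every term divided by any of its indeterminates stays in the set."""
    term_set = set(terms)
    for term in term_set:
        for i, exponent in enumerate(term):
            if exponent > 0:
                divisor = term[:i] + (exponent - 1,) + term[i + 1:]
                if divisor not in term_set:
                    return False
    return True


# Levels (4, 3) correspond to the design {0,1,2,3} x {0,1,2} used above.
O_D = complete_estimable_terms((4, 3))
print(len(O_D), is_order_ideal(O_D))
```

The same `is_order_ideal` check can be used to validate a candidate set of estimable terms
before it is fed into any further computation.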


The following auxiliary result will be useful for proving the main theorem. 


Lemma 4.4.3. Let $D$ be a full factorial design, let $\{f_1,\dots,f_n\}$ be its canonical
polynomials, let $\overline{K}$ be the algebraic closure of $K$, and let $I$ be a proper
ideal of $\overline{K}[x_1,\dots,x_n]$ such that $I_D \subseteq I$.


1. The ideal $I$ is a radical ideal. It is the vanishing ideal of a fraction of $D$.
2. The ideal $I$ is generated by elements of $P$, and $I \cap P$ is a radical ideal.
3. The polynomials of every border basis of $I$ are elements of $P$.


Proof. First we prove Claim 1. Let $\mathbb{A}^n(\overline{K})$ be the affine space of
dimension $n$ over $\overline{K}$, and let $F \subseteq \mathbb{A}^n(\overline{K})$ be the set
of zeros of $I$. Since $I_D \subseteq I$, we have $F \subseteq D$. By localizing the ring
$A = \overline{K}[x_1,\dots,x_n]/I_D$ at the maximal ideals $\mathfrak{m}$ corresponding to the
points of $D$, we see that either $I A_{\mathfrak{m}} = (1)$ or
$I A_{\mathfrak{m}} = \mathfrak{m} A_{\mathfrak{m}}$. Therefore $I$ is a radical ideal, and
hence it is the defining ideal of $F$.

Since $I$ is the defining ideal of a finite set of points with coordinates in $K$,
it is the intersection of ideals generated by linear forms having coefficients
in $K$. Consequently, the ideal $I$ is defined over $K$ which proves Claim 2. The
third claim follows from Theorem 4.3.4.


Now we are ready to state the main result of this section. Our goal is to
solve the inverse problem. The idea is to proceed as follows. We are given a
full factorial design $D$ and an order ideal $\mathcal{O}$. By Theorem 4.3.4, ideals $I$ such
that $\mathcal{O}$ represents a $K$-basis of $P/I$ are in 1-1 correspondence with border
bases whose elements are marked by the terms in $\partial\mathcal{O}$. Except for the border
basis elements which are canonical polynomials of $D$, we can write them
down using indeterminate coefficients and require that the corresponding formal
multiplication matrices are pairwise commuting. For $I$ to be the vanishing
ideal of a fraction contained in $D$, we have to make sure that $I$ contains $I_D$. To
this end, we require that the normal $\mathcal{O}$-remainders of the canonical polynomials
of $D$ are zero. By combining these requirements, we arrive at the following
result.


Theorem 4.4.4 (Computing All Fractions). 

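Let $D$ be a full factorial design with levels $(\ell_1,\dots,\ell_n)$, and let
$\mathcal{O} = \{t_1,\dots,t_\mu\}$ be a complete set of estimable terms contained in
$\mathcal{O}_D$ with $t_1 = 1$. Consider the following definitions.


1. Let $C = \{f_1,\dots,f_n\}$ be the set of canonical polynomials of $D$, where $f_i$ is
marked by $x_i^{\ell_i}$ for $i = 1,\dots,n$.

2. Decompose $\partial\mathcal{O}$ into
$\partial\mathcal{O}_1 = \{x_1^{\ell_1},\dots,x_n^{\ell_n}\} \cap \partial\mathcal{O}$ and
$\partial\mathcal{O}_2 = \partial\mathcal{O} \setminus \partial\mathcal{O}_1$.

3. Let $C_1$ be the subset of $C$ marked by $\partial\mathcal{O}_1$, and let
$C_2 = C \setminus C_1$.

4. Let $\eta = \#(\partial\mathcal{O}_2)$. For $i = 1,\dots,\eta$ and $j = 1,\dots,\mu$,
introduce new indeterminates $z_{ij}$.

5. For every $b_k \in \partial\mathcal{O}_2$, let
$g_k = b_k - \sum_{j=1}^{\mu} z_{kj} t_j \in K(z_{ij})[x_1,\dots,x_n]$.

6. Let $G = \{g_1,\dots,g_\eta\}$ and $H = G \cup C_1$. Let
$\mathcal{M}_1,\dots,\mathcal{M}_n$ be the formal multiplication matrices associated to the
$\mathcal{O}$-border prebasis $H$.

7. Let $\mathcal{I}(\mathcal{O})$ be the ideal in $K[z_{ij}]$ generated by the entries of the
matrices $\mathcal{M}_i \mathcal{M}_j - \mathcal{M}_j \mathcal{M}_i$ for $1 \le i < j \le n$,
and by the entries of the column matrices $f(\mathcal{M}_1,\dots,\mathcal{M}_n) \cdot e_1$ for
all $f \in C_2$.


Then $\mathcal{I}(\mathcal{O})$ is a zero-dimensional ideal in $K[z_{ij}]$ whose zeros are in
1-1 correspondence with the solutions of the inverse problem, i.e. with fractions
$F \subseteq D$ such that $\mathcal{O}$ represents a $K$-basis of $P/I_F$.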

Proof. Let $p = (\alpha_{11},\dots,\alpha_{\eta\mu}) \in \overline{K}^{\eta\mu}$ be a zero of
$\mathcal{I}(\mathcal{O})$. When we substitute the indeterminates $z_{ij}$ by the coordinates
of $p$ in the matrices $\mathcal{M}_1,\dots,\mathcal{M}_n$, we obtain pairwise commuting
matrices $\overline{M}_1,\dots,\overline{M}_n$ which feature the additional property that
$f(\overline{M}_1,\dots,\overline{M}_n) \cdot e_1 = 0$ for every $f \in C_2$.

Now we substitute the coordinates of $p$ in the polynomials of $G$ and get polynomials
$\bar g_k = b_k - \sum_{j=1}^{\mu} \alpha_{kj} t_j \in \overline{K}[x_1,\dots,x_n]$. Then we
form the sets $\overline{G} = \{\bar g_1,\dots,\bar g_\eta\}$ and
$\overline{H} = \overline{G} \cup C_1$, and we let $I$ be the ideal generated by $\overline{H}$.
Since the set $H$ is an $\mathcal{O}$-border prebasis of the ideal generated by it, the set
$\overline{H}$ is an $\mathcal{O}$-border prebasis of $I$. Moreover, the fact that
$\mathcal{M}_1,\dots,\mathcal{M}_n$ are the formal multiplication matrices of $H$ implies that
$\overline{M}_1,\dots,\overline{M}_n$ are the formal multiplication matrices of $\overline{H}$.
Hence we can apply Theorem 4.3.17 and conclude that $\overline{H}$ is the $\mathcal{O}$-border
basis of $I$.

By definition, we have $C_1 \subseteq I$. Using Proposition 4.3.13.5, we see that
$f(\overline{M}_1,\dots,\overline{M}_n) \cdot e_1 = 0$ implies $\mathrm{NF}_{\mathcal{O},I}(f) = 0$,
and therefore $f \in I$ for all $f \in C_2$. Altogether, we have
$C = C_1 \cup C_2 \subseteq I$, and thus $I_D \subseteq I$. By Lemma 4.4.3.1, it follows that
$I$ is the vanishing ideal of a fraction of $D$.

Conversely, let $F$ be a fraction of $D$ such that $\mathcal{O}$ represents a $K$-basis
of $P/I_F$. Consider the $\mathcal{O}$-border basis $B$ of $I_F$ and write $B = B_1 \cup B_2$
such that $B_1$ contains the polynomials marked by $\partial\mathcal{O}_1$ and $B_2$ contains
the polynomials marked by $\partial\mathcal{O}_2$. Since
$\partial\mathcal{O}_1 \subseteq \partial\mathcal{O}_D$, the polynomials in $B_1$ have the
shape required for $\mathcal{O}_D$-border basis elements of $I_D$, i.e. they agree with the
polynomials in $C_1$. The polynomials in $B_2$ are of the form
$g_k = b_k - \sum_{j=1}^{\mu} \alpha_{kj} t_j$ where $b_k \in \partial\mathcal{O}_2$ and
$\alpha_{kj} \in K$. Let $p = (\alpha_{ij}) \in K^{\eta\mu}$. We claim that $p$ is a zero
of $\mathcal{I}(\mathcal{O})$.

The point $p$ is a zero of the entries of the matrices
$\mathcal{M}_i \mathcal{M}_j - \mathcal{M}_j \mathcal{M}_i$ for $1 \le i < j \le n$, since the
matrices $\overline{M}_1,\dots,\overline{M}_n$ obtained by substituting $p$ in
$\mathcal{M}_1,\dots,\mathcal{M}_n$ are the formal multiplication matrices of $B$ and thus
commute by Theorem 4.3.17. The point $p$ is a zero of the entries of
$f(\mathcal{M}_1,\dots,\mathcal{M}_n) \cdot e_1$ for $f \in C_2$, since
$f(\overline{M}_1,\dots,\overline{M}_n) \cdot e_1$ equals $\mathrm{NF}_{\mathcal{O},I_F}(f)$ by
Proposition 4.3.13.5, and this normal form is zero because
$f \in C_2 \subseteq I_D \subseteq I_F$. Altogether, we have shown that $p$ is a zero of
$\mathcal{I}(\mathcal{O})$, as claimed.


Using distracted fractions (see [RR98]), one can show that there always 
exists at least one solution of the inverse problem. Let us look at an example 
to illustrate the method. 


Example 4.4.5. Let D be the full factorial design $D = \{0,1,2,3\} \times \{0,1,2\}$
contained in $\mathbb{A}^2(\mathbb{Q})$, and let $O = \{1, x, y, x^2, xy, y^2, x^3, x^2y\} \subseteq O_D$. The
order ideal O can be visualized as follows.


[Figure: sketch of the order ideal O in the (x, y)-plane, with the terms of O marked by bullets.]


We want to find a fraction $F \subseteq D$ such that O represents a K-basis
of $P/I_F$. One solution is to use the distracted fraction whose points are
exactly the points marked by bullets in the above sketch, i.e. the following set
F = {(0,0), (0,1), (0,2), (1,0), (1,1), (2,0), (2,1), (3,0)}. An easy
computation shows that the vanishing ideal of F is


$I_F = (x(x-1)(x-2)(x-3),\ x(x-1)(x-2)y,\ xy(y-1),\ y(y-1)(y-2))$.


Moreover, these four generators are a universal Gröbner basis of $I_F$ and
$O_\sigma(I_F) = O$ for every term ordering $\sigma$.


We end this section with two examples intended to explain how Theo- 
rem 4.4.4 solves the inverse problem. 


Example 4.4.6. Let D be the full factorial design $D = \{-1,0,1\} \times \{-1,1\}$
with levels (3,2) contained in $\mathbb{A}^2(\mathbb{Q})$. The complete set of estimable terms
of D is $O_D = \{1, x, y, x^2, xy, x^2y\}$. We want to solve the inverse problem for
the order ideal $O = \{1, x, y\}$ and follow the steps of Theorem 4.4.4.


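1. The set of canonical polynomials of D is $C = \{f_1, f_2\}$, where $f_1 = x^3 - x$
and $f_2 = y^2 - 1$.

2. We decompose $\partial O = \{x^2, xy, y^2\}$ into $\partial O_1 = \{y^2\}$ and $\partial O_2 = \{x^2, xy\}$.

3. Let $C_1 = \{f_2\}$ and $C_2 = \{f_1\}$.

4. Let $\eta = 2$. Choose six new indeterminates $z_{11}, z_{12}, z_{13}, z_{21}, z_{22}, z_{23}$.

5. Define $g_1 = x^2 - (z_{11} + z_{12}x + z_{13}y)$ and $g_2 = xy - (z_{21} + z_{22}x + z_{23}y)$.

6. Let $G = \{g_1, g_2\}$ and $H = \{g_1, g_2, f_2\}$. The formal multiplication matrices
associated to H are

$$M_1 = \begin{pmatrix} 0 & z_{11} & z_{21} \\ 1 & z_{12} & z_{22} \\ 0 & z_{13} & z_{23} \end{pmatrix}
\quad\text{and}\quad
M_2 = \begin{pmatrix} 0 & z_{21} & 1 \\ 0 & z_{22} & 0 \\ 1 & z_{23} & 0 \end{pmatrix}.$$

7. Let $\mathcal{I}(O) \subseteq \mathbb{Q}[z_{11},\dots,z_{23}]$ be the ideal generated by the entries of the
matrices $M_1M_2 - M_2M_1$ and $f_1(M_1, M_2)\cdot e_1 = (M_1^3 - M_1)\cdot e_1$. We
obtain
$\mathcal{I}(O) = (z_{12}z_{21} - z_{11}z_{22} - z_{21}z_{23} + z_{13},\ z_{21}z_{22} + z_{23},\ z_{22}z_{23} + z_{21},$
$z_{22}^2 - 1,\ z_{13}z_{22} - z_{12}z_{23} + z_{23}^2 - z_{11},\ z_{22}z_{23} + z_{21},\ z_{11}z_{12} + z_{13}z_{21},$
$z_{12}^2 + z_{13}z_{22} + z_{11} - 1,\ z_{12}z_{13} + z_{13}z_{23})$.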


Using a computer algebra system, for instance CoCoA, we can check
that $\mathcal{I}(O)$ is a zero-dimensional, radical ideal of multiplicity 18. This means
that among the $20 = \binom{6}{3}$ triples of points of D, there are 18 triples which
solve the inverse problem. The two missing fractions are the two collinear
triples {(-1,-1), (0,-1), (1,-1)} and {(-1,1), (0,1), (1,1)}.
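The count of 18 can be cross-checked without computing the ideal at all. The following is only a sketch in Python (not the CoCoA computation mentioned in the text); it rests on the standard fact that, for a set F of three points, O = {1, x, y} represents a basis of P/I_F exactly when the 3 x 3 matrix of values of 1, x, y at the points of F is invertible. The names `DESIGN`, `det3` and `solves_inverse_problem` are illustrative and not taken from the text.

```python
from itertools import combinations, product

# Full factorial design D = {-1, 0, 1} x {-1, 1} from Example 4.4.6.
DESIGN = list(product([-1, 0, 1], [-1, 1]))


def det3(m):
    """Determinant of a 3x3 integer matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solves_inverse_problem(fraction):
    """True if the values of 1, x, y on the three points form an invertible matrix."""
    rows = [(1, x, y) for (x, y) in fraction]
    return det3(rows) != 0


triples = list(combinations(DESIGN, 3))
good = [t for t in triples if solves_inverse_problem(t)]
bad = [t for t in triples if not solves_inverse_problem(t)]

print(len(triples), len(good))  # 20 18
for t in bad:
    print(t)  # the two triples lying on a horizontal line
```

The two triples that fail are exactly the ones on which a linear polynomial (y + 1 or y - 1) vanishes, so that 1, x, y become linearly dependent modulo the vanishing ideal.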


When we apply the theorem to larger full factorial designs, the calculations
involved in determining the zeros of $\mathcal{I}(O)$ quickly become voluminous.


Example 4.4.7. Let D be the full factorial design $D = \{-1,0,1\} \times \{-1,0,1\}$
with levels (3,3) contained in $\mathbb{A}^2(\mathbb{Q})$. The complete set of estimable terms
of D is $O_D = \{1, x, y, x^2, xy, y^2, x^2y, xy^2, x^2y^2\}$. We want to solve the
inverse problem for the order ideal $O = \{1, x, y, x^2, y^2\}$ and follow the steps
of Theorem 4.4.4.


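1. The set of canonical polynomials of D is $C = \{f_1, f_2\}$, where $f_1 = x^3 - x$
and $f_2 = y^3 - y$.

2. We decompose $\partial O = \{x^3, x^2y, xy, xy^2, y^3\}$ into $\partial O_1 = \{x^3, y^3\}$ and
$\partial O_2 = \{x^2y, xy, xy^2\}$.

3. Let $C_1 = \{f_1, f_2\}$ and $C_2 = \emptyset$.

4. Let $\eta = 3$. Choose 15 new indeterminates $z_{11}, z_{12}, \dots, z_{35}$.

5. Define $g_1 = x^2y - (z_{11} + z_{12}x + z_{13}y + z_{14}x^2 + z_{15}y^2)$,
$g_2 = xy - (z_{21} + z_{22}x + z_{23}y + z_{24}x^2 + z_{25}y^2)$ and
$g_3 = xy^2 - (z_{31} + z_{32}x + z_{33}y + z_{34}x^2 + z_{35}y^2)$.

6. Let $G = \{g_1, g_2, g_3\}$ and $H = \{g_1, g_2, g_3, f_1, f_2\}$. The formal multiplication
matrices associated to H are

$$M_1 = \begin{pmatrix}
0 & 0 & z_{21} & 0 & z_{31} \\
1 & 0 & z_{22} & 1 & z_{32} \\
0 & 0 & z_{23} & 0 & z_{33} \\
0 & 1 & z_{24} & 0 & z_{34} \\
0 & 0 & z_{25} & 0 & z_{35}
\end{pmatrix}
\qquad
M_2 = \begin{pmatrix}
0 & z_{21} & 0 & z_{11} & 0 \\
0 & z_{22} & 0 & z_{12} & 0 \\
1 & z_{23} & 0 & z_{13} & 1 \\
0 & z_{24} & 0 & z_{14} & 0 \\
0 & z_{25} & 1 & z_{15} & 0
\end{pmatrix}$$

7. Let $\mathcal{I}(O)$ be the ideal in $\mathbb{Q}[z_{11},\dots,z_{35}]$ generated by the entries of the
matrix $M_1M_2 - M_2M_1$. Thus $\mathcal{I}(O)$ is the ideal generated by the following
20 polynomials:

$$\begin{array}{ll}
z_{21}z_{23} + z_{25}z_{31} - z_{11} & z_{21}z_{22} + z_{11}z_{24} - z_{31} \\
z_{13}z_{21} + z_{15}z_{31} - z_{21} & z_{21}z_{32} + z_{11}z_{34} - z_{21} \\
z_{22}z_{23} + z_{25}z_{32} - z_{12} + z_{21} + z_{24} & z_{22}^2 + z_{12}z_{24} - z_{32} \\
z_{13}z_{22} + z_{15}z_{32} + z_{11} + z_{14} - z_{22} & z_{22}z_{32} + z_{12}z_{34} - z_{22} \\
z_{23}^2 + z_{25}z_{33} - z_{13} & z_{22}z_{23} + z_{13}z_{24} + z_{21} + z_{25} - z_{33} \\
z_{13}z_{23} + z_{15}z_{33} - z_{23} & z_{23}z_{32} + z_{13}z_{34} - z_{23} + z_{31} + z_{35} \\
z_{23}z_{24} + z_{25}z_{34} - z_{14} + z_{22} & z_{14}z_{24} + z_{22}z_{24} - z_{34} \\
z_{13}z_{24} + z_{15}z_{34} + z_{12} - z_{24} & z_{24}z_{32} + z_{14}z_{34} - z_{24} \\
z_{23}z_{25} + z_{25}z_{35} - z_{15} & z_{15}z_{24} + z_{22}z_{25} + z_{23} - z_{35} \\
z_{13}z_{25} + z_{15}z_{35} - z_{25} & z_{25}z_{32} + z_{15}z_{34} - z_{25} + z_{33}
\end{array}$$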


Again we can use a computer algebra system and check that $\mathcal{I}(O)$ is a
zero-dimensional, radical ideal of multiplicity 81. This means that among the
$126 = \binom{9}{5}$ five-tuples of points in D there are 81 five-tuples which solve the
inverse problem.
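As a hedged cross-check of the number 81, one can again count directly: a five-point fraction F solves the inverse problem exactly when the 5 x 5 matrix of values of 1, x, y, x^2, y^2 on F is invertible. The sketch below assumes nothing beyond the Python standard library; `is_invertible`, `evaluation_matrix` and the exponent encoding of O are illustrative choices, not part of the text.

```python
from fractions import Fraction
from itertools import combinations, product

# Full factorial design D = {-1, 0, 1} x {-1, 0, 1} from Example 4.4.7.
DESIGN = list(product([-1, 0, 1], repeat=2))

# Terms of O = {1, x, y, x^2, y^2}, stored as exponent pairs (a, b) for x^a * y^b.
ORDER_IDEAL = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2)]


def is_invertible(matrix):
    """Exact Gaussian elimination over the rationals; True if full rank."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return False
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return True


def evaluation_matrix(points):
    """Rows: points of the fraction; columns: values of the terms of O."""
    return [[x ** a * y ** b for (a, b) in ORDER_IDEAL] for (x, y) in points]


five_tuples = list(combinations(DESIGN, 5))
solutions = [f for f in five_tuples if is_invertible(evaluation_matrix(f))]

print(len(five_tuples), len(solutions))  # 126 81
```

This brute-force count does not produce the border bases themselves; it only confirms how many fractions admit O as a basis of the quotient.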


One of the zeros of $\mathcal{I}(O)$ is the point $p \in \mathbb{Q}^{15}$ whose coordinates are

$$\begin{array}{lllll}
z_{11} = 0 & z_{12} = 0 & z_{13} = \tfrac12 & z_{14} = 0 & z_{15} = \tfrac12 \\
z_{21} = 0 & z_{22} = 1 & z_{23} = \tfrac12 & z_{24} = -1 & z_{25} = \tfrac12 \\
z_{31} = 0 & z_{32} = 1 & z_{33} = \tfrac12 & z_{34} = -1 & z_{35} = \tfrac12
\end{array}$$

The corresponding O-border basis is
$\{x^3 - x,\ x^2y - \tfrac12 y - \tfrac12 y^2,\ xy - x - \tfrac12 y + x^2 - \tfrac12 y^2,\ xy^2 - x - \tfrac12 y + x^2 - \tfrac12 y^2,\ y^3 - y\}$.
The fraction defined by this basis is

$F_0 = \{(0,0), (0,-1), (1,0), (1,1), (-1,1)\}$


This is our old friend of Example 4.3.7!
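The following sketch checks this example numerically. It assumes the convention of Theorem 4.4.4, $g_k = b_k - \sum_j z_{kj} t_j$, and reads the coefficient rows `Z1`, `Z2`, `Z3` off the border basis displayed above; all Python names are illustrative. It verifies two things: the multiplication matrices built from these coefficients commute, and every element of the border basis vanishes on the five points of the fraction.

```python
from fractions import Fraction as Fr

HALF = Fr(1, 2)

# Coefficients of 1, x, y, x^2, y^2 in the normal forms of the border terms.
Z1 = [0, 0, HALF, 0, HALF]    # x^2*y = y/2 + y^2/2
Z2 = [0, 1, HALF, -1, HALF]   # x*y   = x + y/2 - x^2 + y^2/2
Z3 = [0, 1, HALF, -1, HALF]   # x*y^2 = x + y/2 - x^2 + y^2/2


def unit(i):
    """i-th standard basis vector of length 5."""
    return [1 if j == i else 0 for j in range(5)]


def from_columns(cols):
    """Build a 5x5 matrix (list of rows) from its list of columns."""
    return [[cols[c][r] for c in range(5)] for r in range(5)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(5)) for j in range(5)]
            for i in range(5)]


# Columns are the images of 1, x, y, x^2, y^2 written in the basis O.
M1 = from_columns([unit(1), unit(3), Z2, unit(1), Z3])  # multiplication by x
M2 = from_columns([unit(2), Z2, unit(4), Z1, unit(2)])  # multiplication by y

F0 = [(0, 0), (0, -1), (1, 0), (1, 1), (-1, 1)]


def border_basis_values(x, y):
    """Values of the five border basis polynomials at the point (x, y)."""
    t = [1, x, y, x * x, y * y]
    dot = lambda z: sum(c * v for c, v in zip(z, t))
    return [x ** 3 - x, x * x * y - dot(Z1), x * y - dot(Z2),
            x * y * y - dot(Z3), y ** 3 - y]


commute = matmul(M1, M2) == matmul(M2, M1)
vanish = all(v == 0 for pt in F0 for v in border_basis_values(*pt))
print(commute, vanish)  # True True
```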

In view of our discussion in Section 4.3.1 it is natural to ask how many
of the 81 fractions F found above have the property that O is not of the
form $O_\sigma(I_F)$ for any term ordering $\sigma$. We have seen in Example 4.3.7 that at
least the fraction $F_0$ is of that type. By combining Theorem 4.4.4 and some
techniques discussed in [CR97], one can show that 36 of those 81 fractions
are of that type. This is a surprisingly high number which shows that border
bases sometimes provide a much more flexible environment for working with
zero-dimensional ideals than Gröbner bases do.


There will never be a last tango
(Brad Hooper)

5.0 Introduction


These lectures, prepared for the CIMPA School on “Systems of polynomial 
equations” (Argentina, 2003), have two goals: to present the underlying ideas 
and tools for computing primary decompositions of ideals, and to apply these 
techniques to a recent interesting class of ideals related to statistics. Primary 
decompositions generalize the notion of solving systems of polynomial equa- 
tions, to the cases where there are infinitely many solutions, or to the case 
when the multiplicity of solutions is important. 

Primary decompositions are an important notion both in algebraic geome- 
try and for applications. There are several algorithms available (the two closest 
to what we present are [GTZ88] and [SY96]). A good overview of the state of 
the art is the paper [DGP99]. Primary decompositions, and related computa- 
tions, such as finding minimal and associated primes, the radical of an ideal, 
and the equidimensional decomposition of an ideal, are all implemented in 
most specialized computer algebra systems, such as CoCoA [Roba], Macaulay 
2 [GS], and Singular [GPS01]. Several years ago, these algorithms and their 
implementations could handle only very small examples. Now, with improved 
implementations, and more efficient computers, larger ideals can be handled. 

However, if the number of indeterminates is large, the implemented algo- 
rithms often are unable to find a primary decomposition, or even to find the 
minimal primes. This is the case for many of the ideals associated to Bayesian 
networks that we consider here. 

Our first goal in these lectures is to describe some basic methods for ma- 
nipulating components of an ideal. We put these together into an algorithm 
for primary decomposition. We challenge our students to combine these tech- 
niques in novel ways to obtain more efficient useful algorithms. 


* The author would like to acknowledge partial financial support by the National
Science Foundation through grant DMS-9970348.

Our second goal is to define some interesting ideals, called Markov ideals,
associated to a Bayesian network. In applications, Bayesian networks have 
been used in many ways, e.g. in machine learning, in vision and speech recog- 
nition, in attempting to reconstruct gene regulatory networks, and in the 
analysis of DNA micro-array data. These Markov ideals provide a very inter- 
esting relationship between multivariate statistics and algebra and geometry. 
In these lectures, we do little more than provide a glimpse into this poten- 
tially powerful relationship. Here is one short glimpse: hidden variables in some 
Markov models correspond to secant loci of Segre embeddings of products of 
projective spaces (see [GSS] for details). 

These Markov ideals often have many components, and can have relatively 
complicated primary decompositions. We apply the techniques that we have 
learned to compute some of these primary decompositions. Instead of giving 
canned algorithms for computing primary decompositions, we will describe 
several tricks of the trade that can be used on a given ideal, to help find 
the primary decomposition “by hand” (although with the help of a computer 
algebra system!). It is likely that superior algorithms exist. Again, we challenge 
our students to find one! 

Lecture 1: We set up the situation, and describe the first two tools of 
computing primary decompositions: ideal quotients, and splitting principles. 
As an example, we find fixed points of some finite dynamical systems. 

Lecture 2: We define Bayesian networks and consider independence state- 
ments between a set of discrete random variables. Given a Bayesian network, 
we can associate an ideal, whose primary decomposition is often hard to com- 
pute, yet very likely carries interesting information. These ideals provide a 
striking new link between algebra/geometry and statistics. 

Lecture 3: We describe several more tools for computing primary de- 
compositions. We ask several questions: (1) How do we find zero divisors to 
use with our splitting principles? (2) How do we detect if an ideal is prime, 
or primary? The tools we develop include birational maps, and the flattener: 
a polynomial obtained by analyzing the fibers of a projection map. Both of 
these techniques rely heavily on a Grobner basis using product orders. We use 
Macaulay 2 to investigate these methods on a simple example. 

Lecture 4: In the final lecture, we put all of these techniques together 
and write relatively complete algorithms. A final technique that we address is 
removing redundancy in the computation as soon as possible. We also present 
some open problems related to the primary decompositions of Markov ideals. 

Throughout, we provide both straightforward and challenging exercises. It 
is worthwhile to do these! One important exercise is to prove each of the lem- 
mas and propositions which are presented without proof. During the lectures, 
we spend more time using these results than proving them, although we do 
include some proofs. 

A good elementary introduction to get ready for these lectures is the book
by Cox, Little, and O’Shea [CLO97]. The first chapter of the recent book by
Hal Schenck [Sch03a] introduces ideal quotients and primary decompositions
in a very nice way. His book also has Macaulay 2 examples throughout. A good
overview of the known algorithms for primary decomposition is presented in
[DGP99]. For delving more deeply into the Bayesian network material, look
at [GSS], and the references contained therein.

Example computer sessions are included for Macaulay 2 [GS]. This is a
system that Dan Grayson and I have been working on for the last ten years.
The system is freely available, and easy to install on most computers. The
web page can be found on the Internet².


5.1 Lecture 1: Algebraic varieties and components 


Throughout these lectures, let k be a field, and let $R = k[x_1,\dots,x_n]$. If
$J = (f_1,\dots,f_r) \subseteq R$ is an ideal, we let


$V(J) = \{p \in k^n \mid f_1(p) = \dots = f_r(p) = 0\}$.


If the base field k is algebraically closed, then there is a beautiful dictionary 
which relates the geometry of X = V(J) to algebraic properties of the ideal 
J. We refer the reader to Cox-Little-O’Shea [CLO97] or Schenck [Sch03a] for 
the details. (If the field is not algebraically closed, the dictionary still exists, 
but relates the algebra of J to properties of the scheme corresponding to J). 

For example, if the base field is algebraically closed, and if $J \subseteq R$ is a
prime ideal (that is, $fg \in J$ implies $f \in J$ or $g \in J$), then V(J) is irreducible
(that is, cannot be written as a union $V(J_1) \cup V(J_2)$ of zero sets which are
properly contained in V(J)).

Every ideal J in R has a primary decomposition, that is, a decomposition


$J = Q_1 \cap \dots \cap Q_r,$


where each $Q_i$ is primary (i.e. if $fg \in Q_i$, then $f \in Q_i$ or $g^N \in Q_i$ for some
integer N). If Q is primary, then the radical


$P = \sqrt{Q} = \{g \in R \mid g^N \in Q \text{ for some } N\}$


is a prime ideal, and Q is called P-primary.

The primary decomposition is called irredundant if the primes $P_i := \sqrt{Q_i}$ are
pairwise distinct, and if removing any one term $Q_i$ breaks the equality. Every
primary decomposition can be pruned to obtain an irredundant primary
decomposition.

If the primary decomposition is irredundant, then the $P_1,\dots,P_r$ are called
the associated primes of J. This set is independent of the particular
(irredundant) primary decomposition. The minimal elements of this set of primes
(with respect to inclusion) are called the minimal primes of J. The radical
of J is the intersection of these minimal primes. If P is a minimal prime,
the corresponding primary ideal is unique (i.e. doesn’t depend on the specific
irredundant primary decomposition). If P is an associated prime, but not
minimal, then P is called an embedded prime. The primary ideal of an embedded
prime is not unique.


² http://www.math.uiuc.edu/Macaulay2


Example 5.1.1. Let $J = (xy) \subseteq k[x, y]$. Geometrically, the zero set xy = 0 is
the union of the two coordinate axes. Algebraically, this is seen in the primary
decomposition $J = (x) \cap (y)$. Both (x) and (y) are minimal primes.

If $J = (x^3y) \subseteq k[x, y]$, then geometrically, the zero set is the union of the
x-axis and a “triple line” x = 0. The primary decomposition is $J = (x^3) \cap (y)$.
Both associated primes are minimal, but this time the ideal $(x^3)$ is primary,
but not prime.


Example 5.1.2. Let $J = (xy, xz) \subseteq k[x, y, z]$. Geometrically, the zero set
xy = xz = 0 is the union of the plane x = 0 and the line y = z = 0. The primary
decomposition of J is $J = (x) \cap (y, z)$.


Example 5.1.3. Let $J = (x^2, xy) \subseteq k[x, y]$. For each $N \ge 1$, we obtain a
different primary decomposition of J:


$J = (x) \cap (x^2, y) = (x) \cap (x^2, xy, y^N)$.


The associated primes are $P_1 = (x)$ and $P_2 = (x, y)$, where $P_1$ is the only
minimal prime, and $P_2$ is embedded. The primary ideal $Q_1 = (x)$ is the same
no matter which primary decomposition we use, but the primary ideal $Q_2$
of $P_2$ depends on the decomposition. Geometrically, V(J) is simply the line
x = 0. Thinking algebraically (or, using schemes), the zero set should really
be considered as the union of this line, and a “fat” embedded point at the
origin.
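Because all ideals in Example 5.1.3 are monomial ideals, the claimed decompositions can be sanity-checked by comparing monomial membership. The sketch below is a bounded check in Python, not a proof: it relies on the fact that an intersection of monomial ideals is again a monomial ideal, and it only tests exponents below an arbitrary `bound`. The helper names are illustrative.

```python
from itertools import product


def in_monomial_ideal(mono, gens):
    """A monomial lies in a monomial ideal iff some generator divides it."""
    return any(all(m >= g for m, g in zip(mono, gen)) for gen in gens)


# Monomials are exponent pairs (deg_x, deg_y).
J = [(2, 0), (1, 1)]   # (x^2, xy)
Q1 = [(1, 0)]          # (x)


def q2(n):
    """Generators of (x^2, xy, y^n)."""
    return [(2, 0), (1, 1), (0, n)]


def decomposition_holds(n, bound=12):
    """Compare membership in J and in (x) ∩ (x^2, xy, y^n) for small monomials."""
    for mono in product(range(bound), repeat=2):
        in_j = in_monomial_ideal(mono, J)
        in_both = in_monomial_ideal(mono, Q1) and in_monomial_ideal(mono, q2(n))
        if in_j != in_both:
            return False
    return True


print([decomposition_holds(n) for n in range(1, 6)])
# [True, True, True, True, True]
```

For N = 1 the second component is $(x^2, xy, y) = (x^2, y)$, which recovers the first decomposition in the example.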


Exercise 5.1.4. Find (by hand) a primary decomposition of the ideal J = 
(2°, xy*z, y°2*) C k[x,y, 2].- 


In these lectures, what computations concern us? Given J, we would like 
to be able to compute (in roughly increasing order of difficulty): 


The radical of J. 

The set of minimal primes of J. 

The P-primary component Q of J, where P is a minimal prime.
The set of associated primes of J. 

An irredundant primary decomposition of J. 


5.1.1 Tool #1: Ideal quotients 


One of the most important constructions in ideal theory is the operation of
ideal quotient.

Definition 5.1.5 (Ideal quotient and saturation). If $I \subseteq R$ is an ideal,
and $f \in R$, then define the ideal quotient


$(I : f) := \{g \in R \mid gf \in I\},$

and the saturation of I by f:

$(I : f^\infty) := \{g \in R \mid gf^N \in I \text{ for some } N\}.$

These somewhat opaque definitions give little clue of their importance.


Lemma 5.1.6. Let Q be a P-primary ideal, and let $f \in R$. Then
(a) If $f \notin P$, then (Q : f) = Q.
(b) If $f \in P$, but $f \notin Q$, then (Q : f) is P-primary.
(c) If $f \in Q$, then (Q : f) = (1).


An elementary fact, which follows directly from the definition, is that


$(I_1 \cap I_2) : f = (I_1 : f) \cap (I_2 : f),$

and so


Lemma 5.1.7. If $J = Q_1 \cap \dots \cap Q_r$ is an irredundant primary decomposition
of J, where $Q_i$ is $P_i$-primary, and if $f \in Q_j$ if and only if $j \ge s+1$, then


$J : f = (Q_1 : f) \cap \dots \cap (Q_s : f)$

is an irredundant primary decomposition of J : f.
Saturations have even simpler behavior. 


Lemma 5.1.8. Let Q be a P-primary ideal, and let $f \in R$. Then
(a) If $f \notin P$, then $(Q : f^\infty) = Q$.
(b) If $f \in P$, then $(Q : f^\infty) = (1)$.


Lemma 5.1.9. If $J = Q_1 \cap \dots \cap Q_r$ is an irredundant primary decomposition
of J, where $Q_i$ is $P_i$-primary, and if $f \in P_j$ if and only if $j \ge s+1$, then


$J : f^\infty = Q_1 \cap \dots \cap Q_s$

is an irredundant primary decomposition of $(J : f^\infty)$.


This says that, geometrically, the components of $V(J : f^\infty)$ are precisely
the components of V(J) which do not lie on the hypersurface f = 0.
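For monomial ideals and a monomial f, quotients and saturations can be computed by hand: (I : f) is generated by the monomials g / gcd(g, f), where g runs over the generators of I. The Python sketch below applies this standard fact to the ideal $J = (x^2, xy)$ of Example 5.1.3, as an illustration of Lemmas 5.1.7 and 5.1.9; it is not the Gröbner basis method discussed next, and the function names are made up for this illustration.

```python
def minimalize(gens):
    """Drop generators that are divisible by another generator."""
    gens = sorted(set(gens))
    return [g for g in gens
            if not any(h != g and all(a >= b for a, b in zip(g, h)) for h in gens)]


def quotient(gens, f):
    """(I : f) for a monomial ideal I and a monomial f, via g / gcd(g, f)."""
    return minimalize([tuple(max(a - b, 0) for a, b in zip(g, f)) for g in gens])


def saturation(gens, f):
    """(I : f^infinity): repeat the quotient until it stabilizes."""
    current = minimalize(gens)
    while True:
        nxt = quotient(current, f)
        if nxt == current:
            return current
        current = nxt


# Monomials are exponent pairs (deg_x, deg_y); J = (x^2, xy).
J = [(2, 0), (1, 1)]
X, Y = (1, 0), (0, 1)

print(quotient(J, Y))    # [(1, 0)]          i.e. (x)
print(saturation(J, Y))  # [(1, 0)]          i.e. (x)
print(quotient(J, X))    # [(0, 1), (1, 0)]  i.e. (x, y)
print(saturation(J, X))  # [(0, 0)]          i.e. (1)
```

Saturating by y removes the embedded component at the origin and leaves the line (x), while x lies in both associated primes, so saturating by x leaves the unit ideal.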

What makes ideal quotients so useful is that they may be computed using 
Grobner bases. 


Proposition 5.1.10. Let $J \subseteq R = k[x_1,\dots,x_n]$ be an ideal, where k is a ring
(e.g. a field, or a PID), and let $f \in R$. If $L = J + (tf - 1) \subseteq k[t, x_1,\dots,x_n]$,
then

$(J : f^\infty) = L \cap k[x_1,\dots,x_n]$.

This is not always the most efficient method to compute saturations. It
also doesn’t allow one to compute ideal quotients easily. There are (at least)
two further ways to compute ideal quotients which are often used: the reverse
lexicographic order, and syzygies. We’ll describe the method using the reverse
lexicographic order, but we’ll leave out the syzygy method.

If $f = x_n$ is a variable, and if J is homogeneous, then $(J : x_n)$ and $(J : x_n^\infty)$
may be computed using a single reverse lexicographic Gröbner basis. The key
insight is that if > is the term order (or ordering) in the following proposition
and g is a homogeneous polynomial, then $x_n \mid g$ if and only if $x_n \mid \mathrm{in}(g)$.


Proposition 5.1.11 (Bayer). Let $J \subseteq k[x_1,\dots,x_n]$ be a homogeneous ideal,
and let > be the graded reverse lexicographic order (GrevLex) with
$x_1 > \dots > x_n$. If the Gröbner basis of J is


$\{g_1,\dots,g_r, h_{r+1},\dots,h_s\},$


where $g_i = x_n^{a_i}h_i$, each $a_i \ge 1$, and $x_n$ does not divide the $h_i$, then
(a) $\{x_n^{a_1-1}h_1,\dots,x_n^{a_r-1}h_r, h_{r+1},\dots,h_s\}$ is a Gröbner basis of $(J : x_n)$, and
(b) $\{h_1,\dots,h_s\}$ is a Gröbner basis of $(J : x_n^\infty)$.

Exercise 5.1.12. This idea can be used to compute J : f and $J : f^\infty$ when
f is not an indeterminate.

(a) Show that if J is homogeneous, and f is homogeneous of degree d, then
Bayer’s method applied to the homogeneous ideal J + (f - z), where z is a
new variable having degree d, can be used to compute J : f and $J : f^\infty$.

(b) Show how to compute the homogenization of an ideal by using
saturation.

(c) Show how to use homogenization and the trick in (a), to compute J : f
and $J : f^\infty$ when J and f are not necessarily homogeneous.


Example 5.1.13. Consider the ideal $J = (c^2 - bd, bc - ad) \subseteq \mathbb{Q}[a, b, c, d]$. Notice
that the plane c = d = 0 is contained in the zero set of J. Let’s look at this
ideal in Macaulay 2.

i1 : R = QQ[a..d];

i2 : J = ideal(c^2-b*d, b*c-a*d)

             2
o2 = ideal (c  - b*d, b*c - a*d)

o2 : Ideal of R

First, here is the primary decomposition of J:

i3 : primaryDecomposition J

                            2                    2
o3 = {ideal (d, c), ideal (c  - b*d, b*c - a*d, b  - a*c)}

o3 : List

The reverse lexicographic order is the default in Macaulay 2:

i4 : gens gb J

o4 = | c2-bd bc-ad b2d-acd |

             1       3
o4 : Matrix R  <--- R

i5 : J : d

             2                    2
o5 = ideal (c  - b*d, b*c - a*d, b  - a*c)

o5 : Ideal of R

i6 : saturate(J,d)

             2                    2
o6 = ideal (c  - b*d, b*c - a*d, b  - a*c)

o6 : Ideal of R

i7 : J == intersect(ideal(c,d),J:d)

o7 = true

5.1.2 Tool #2: Splitting principles 


The key technique on which almost all algorithms for primary decomposition 
are based is the following very simple lemma. 


Proposition 5.1.14. If $(J : f^\infty) = (J : f^\ell)$, then

$J = (J : f^\ell) \cap (J, f^\ell)$.


Proof. Suppose that $g \in (J : f^\ell)$ and also that $g \in (J, f^\ell)$. We want to show
that $g \in J$. So $g = a + bf^\ell$, for some $a \in J$ and $b \in R$. However, $gf^\ell \in J$, so
$bf^{2\ell} \in J$. Therefore $b \in (J : f^\infty) = (J : f^\ell)$, and so $g \in J$.
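As a small worked instance of Proposition 5.1.14 (a sketch, reusing the ideal of Example 5.1.3 rather than anything computed in the text), take $J = (x^2, xy)$ and $f = y$. The quotient already stabilizes at $\ell = 1$:

```latex
\begin{align*}
(J : y) &= (x) = (J : y^2) = (J : y^\infty), \\
(J, y)  &= (x^2, xy, y) = (x^2, y), \\
(J : y) \cap (J, y) &= (x) \cap (x^2, y) = (x^2, xy) = J.
\end{align*}
```

The two pieces are exactly the primary components $Q_1 = (x)$ and $Q_2 = (x^2, y)$ found in Example 5.1.3.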


If a polynomial f satisfies $(J : f) \ne J$ and $f^\ell \notin J$ for any $\ell$, we’ll call f a
splitting polynomial for J. As a simple exercise, show that there is no splitting
polynomial for J if and only if J is a primary ideal.

If we are only interested in finding the set of minimal primes, we may take
the radicals of both sides to obtain: for any $f \in R$,


$\sqrt{J} = \sqrt{(J : f^\infty)} \cap \sqrt{(J, f)}$.


Another useful splitting formula is: if $f_1 f_2 \cdots f_r \in J$, then


$\sqrt{J} = \sqrt{(J, f_1)} \cap \dots \cap \sqrt{(J, f_r)}$.


If we have a way of finding, given an ideal J, a splitting polynomial for J,
then we may build a recursive algorithm to compute a decomposition of J.


5.1.3 An example: Finite dynamical systems


As an example, let’s consider finite dynamical systems: given a prime number
p, let $\mathbb{F}_p$ be the finite field with p elements, let $R = \mathbb{F}_p[x_1,\dots,x_n]$, and let
$F : \mathbb{F}_p^n \to \mathbb{F}_p^n$ be defined by


$a = (a_1,\dots,a_n) \mapsto (f_1(a),\dots,f_n(a)),$


where $f_i \in R$.
All finite dynamical systems can be written in this form:


Exercise 5.1.15. Show that, for any natural number n > 0 and any function
$f : \mathbb{F}_p^n \to \mathbb{F}_p^n$, there are polynomials $g_i \in \mathbb{F}_p[x_1,\dots,x_n]$ such that
$f(a) = (g_1(a),\dots,g_n(a))$ for all $a \in \mathbb{F}_p^n$.


By iterating F, we obtain a directed graph whose vertices are the $p^n$ points
of $\mathbb{F}_p^n$, with a directed edge from a to F(a) for each point a.

In this example, we are interested in finding the fixed points of F, or more
generally, of $F^r$ (apply F r times) for some integer r. The fixed points of F
are the zeros of the ideal $J = (x_1 - f_1,\dots,x_n - f_n)$ which have all coordinates
in $\mathbb{F}_p$. The problem is that there may be solutions over an extension field of
$\mathbb{F}_p$, and we are not particularly interested in these solutions. Notice that if
$x \in \overline{\mathbb{F}}_p$, then $x \in \mathbb{F}_p$ if and only if $x^p - x = 0$. So, if we include the
polynomials $x_i^p - x_i$, then our zero set will only contain elements of the field we
are interested in.

These ideals are always equal to their own radical, and so we need not 
worry about embedded components: 


Lemma 5.1.16. Let $J = (g_1,\dots,g_s, x_1^p - x_1,\dots,x_n^p - x_n) \subseteq k[x_1,\dots,x_n]$.
For any choice of the $g_i$’s, $J = \sqrt{J}$.


Exercise 5.1.17. Prove this lemma. Use (or prove!) the fact that if
$J \subseteq k[x_1,\dots,x_n]$ is a zero-dimensional ideal, then the radical of J is


$\sqrt{J} = J + (h_1,\dots,h_n),$


where $h_i$ is the squarefree part of the generator of the ideal $J \cap k[x_i]$.
See for example Chapter 2, Section 2.1.2.


We may use any of these splitting principles to compute the minimal
primes (and therefore the primary decomposition) of J, since we have many
zero-divisors around: each $x_i$ is (potentially) a zero-divisor!


Example 5.1.18. Let $R = k[x_1,\dots,x_4]$, where $k = \mathbb{F}_2$. Let $F : k^4 \to k^4$.

The associated directed graph has $2^4 = 16$ nodes. Let’s find the fixed
points of one such finite dynamical system, with the aid of Macaulay 2. In
such a small example, we can compute the fixed points by hand. For larger
examples, e.g. p = 3, n = 20, this is not so easy!

i8 : R = ZZ/2[x_1 .. x_4];

i9 : L = ideal(x_1^2 + x_1, x_2^2 + x_2, x_3^2 + x_3, x_4^2 + x_4);

o9 : Ideal of R


Our sample finite dynamical system: 


i10 : F = matrix {{x_1*x_2*x_4+x_1+x_4,
                   x_1*x_3*x_4+x_2*x_4+x_2,
                   x_1*x_3+x_3*x_4+x_3,
                   x_1*x_3*x_4+x_1+x_4}}

o10 = | x_1x_2x_4+x_1+x_4 x_1x_3x_4+x_2x_4+x_2 x_1x_3+x_3x_4+x_3 x_1x_ ...

              1       4
o10 : Matrix R  <--- R

Fixed points of F are precisely the zeros of the following ideal.

i11 : J = L + ideal (vars R - F);

o11 : Ideal of R

i12 : transpose gens gb J

o12 = {-1} | x_1+x_4       |
      {-2} | x_4^2+x_4     |
      {-2} | x_3x_4+x_1    |
      {-2} | x_2x_4+x_3x_4 |
      {-2} | x_3^2+x_3     |
      {-2} | x_2^2+x_2     |

              6       1
o12 : Matrix R  <--- R

Although we could solve these equations by hand, we instead blindly follow
the recursion using indeterminates as (potential) zero divisors. We start with
x_1.

i13 : J1 = J : x_1

o13 = ideal (x  + 1, x  + 1, x  + 1, x  + 1)
              4       3       2       1

o13 : Ideal of R

i14 : J2 = ideal gens gb(J + ideal(x_1))

                      2        2
o14 = ideal (x , x , x  + x , x  + x )
              4   1   3    3   2    2

o14 : Ideal of R

The intersection of these ideals is J.

i15 : J == intersect(J1,J2)

o15 = true

The first ideal is already linear, so its zero set is a point. From the description
of J2 we could write down the rest of the solutions, but let’s continue. Split
using x_3:

i16 : J21 = J2 : x_3

                              2
o16 = ideal (x , x  + 1, x , x  + x )
              4   3       1   2    2

o16 : Ideal of R

i17 : J22 = ideal gens gb(J2 + ideal(x_3))

                          2
o17 = ideal (x , x , x , x  + x )
              4   3   1   2    2

o17 : Ideal of R

Now we can split each of these using x_2, obtaining 5 solutions total. Already,
one can imagine ways to improve the efficiency of even this small example.
For larger problems, these improvements can make the difference between
obtaining an answer and waiting forever!

We could have computed this directly in Macaulay 2. The decompose routine
provides the list of minimal primes. The primaryDecomposition routine
provides an irredundant primary decomposition.

i18 : C = decompose J;

Display these ideals:

i19 : C/(I -> (<< toString I << endl));
ideal (x_2,x_3,x_4,x_1)
ideal (x_2+1,x_3,x_4,x_1)
ideal (x_2+1,x_3+1,x_4,x_1)
ideal (x_2,x_3+1,x_4,x_1)
ideal (x_2+1,x_3+1,x_4+1,x_1+1)

Exercise 5.1.19. Let $R = \mathbb{F}_3[x_1,\dots,x_{20}]$. Choose $F = (f_1,\dots,f_{20})$ such
that each $f_i$ is a sum of two randomly chosen quadratic monomials $x_jx_k$. Find
the fixed points of this finite dynamical system. Also find the points of order
2, i.e. those points a such that a = F(F(a)).

Here is an open question: can you characterize the graphs (of $3^{20}$ vertices)
which arise from F in this way?


5.2 Lecture 2: Bayesian networks and Markov ideals 


The emerging field of algebraic statistics [PRW00] advocates polynomial
algebra as a tool in the statistical analysis of experiments and discrete data.
Statistics textbooks define a statistical model as a family of probability
distributions, and a closer look reveals that these families are often algebraic
varieties: they are the zeros of some polynomials in the probability simplex
[GHKM01], [SS00].

We begin by reviewing the general algebraic framework for independence
models presented in [Stu02, §8]. Let $X_1,\dots,X_n$ be discrete random variables
where $X_i$ takes values in the finite set $[d_i] = \{1,2,\dots,d_i\}$. We write
$D = [d_1] \times [d_2] \times \dots \times [d_n]$ so that $\mathbb{C}^D$ denotes the complex vector space of
n-dimensional tables of format $d_1 \times \dots \times d_n$. We introduce an indeterminate
$p_{u_1u_2\cdots u_n}$ which represents the probability of the event
$X_1 = u_1, X_2 = u_2, \dots, X_n = u_n$. These indeterminates generate the ring
$\mathbb{C}[D]$ of polynomial functions on the space of tables $\mathbb{C}^D$. We could also use the
field $\mathbb{R}$. The points of interest from statistics are those in the probability
simplex $\Delta$: the set of points whose coordinates are in the interval [0,1], and
whose coordinates sum to 1.

A conditional independence statement has the form


A is independent of B given C (in symbols: $A \perp\!\!\!\perp B \mid C$) (5.1)


where A, B and C are pairwise disjoint subsets of $\{X_1,\dots,X_n\}$. If C is empty
then (5.1) means that A is independent of B.


Example 5.2.1. Let $X_1$ be the statement: it will rain today. Let $X_2$ be the
statement: a puddle will form next to my car door. Let $X_3$ be the statement:
I will get wet when I step out of the car. These are all binary random variables.
Given that the puddle has formed, the other two are independent statements:
$X_1 \perp\!\!\!\perp X_3 \mid X_2$.


By [Stu02, Proposition 8.1], the statement (5.1) translates into a set of
homogeneous quadratic polynomials in $\mathbb{C}[D]$, and we write $I_{A \perp\!\!\!\perp B \mid C}$ for the
ideal generated by these polynomials. The following example gives the basic
idea and method for finding these ideals.


Example 5.2.2. Let $X_1, X_2, X_3$ be three random variables, with $d_1 = d_3 = 2$
and $d_2 = 3$. Let’s write down the ideal in $k[p_{u_1u_2u_3}]$ (12 variables) which
defines the set of probability distributions which satisfy $X_1 \perp\!\!\!\perp X_2 \mid X_3$.

A probability distribution satisfies this independence condition if


$\Pr(X_1 = u_1, X_2 = u_2 \mid X_3 = u_3) = \Pr(X_1 = u_1 \mid X_3 = u_3)\,\Pr(X_2 = u_2 \mid X_3 = u_3)$


for all choices of $u_i \in [d_i]$. By writing out the conditional probabilities and
clearing the denominators $\Pr(X_3 = u_3)$, we obtain


$p_{++u_3}\,p_{u_1u_2u_3} = p_{u_1+u_3}\,p_{+u_2u_3},$


where we have replaced Pr by p, and a “+” means sum over all possible values
in that variable (i.e. marginalize over that variable). For example,


$p_{1+2} = p_{112} + p_{122} + p_{132}$.


It is a simple exercise in determinants to show that the ideal generated by


$\{p_{++u_3}\,p_{u_1u_2u_3} - p_{u_1+u_3}\,p_{+u_2u_3} \mid \text{all } u_1, u_2, u_3\}$


is the same as the ideal generated by the six 2 by 2 minors of the matrices
$M_1$ and $M_2$, where

$$M_i = \begin{pmatrix} p_{11i} & p_{12i} & p_{13i} \\ p_{21i} & p_{22i} & p_{23i} \end{pmatrix}.$$


Note that all 12 indeterminates appear, and each matrix has 6 of them.

The general case goes the same way: the ideal $I_{A \perp\!\!\!\perp B \mid C}$ is generated by
the 2 by 2 minors of matrices $M_i$, for i = 1, ..., c, where c is the number of
possible values of C. Each matrix is obtained by making an a × b matrix
(a and b being the numbers of possible values of A and B) where the (j, k)th
entry is the linear polynomial in the $p_{u_1\cdots u_n}$ which represents
Pr(A = j, B = k, C = i).
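To see the minors of Example 5.2.2 at work, the following Python sketch builds a hypothetical distribution that is conditionally independent by construction and evaluates the six 2 by 2 minors of $M_1$ and $M_2$ on it, using exact rational arithmetic. The numerical probabilities and the helper names are invented for illustration, and states are indexed from 0 rather than from 1.

```python
from fractions import Fraction as Fr
from itertools import combinations, product

D1, D2, D3 = 2, 3, 2   # numbers of states of X1, X2, X3 in Example 5.2.2


def minors_2x2(matrix):
    """All 2x2 minors of a matrix given as a list of rows."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r1][c1] * matrix[r2][c2] - matrix[r1][c2] * matrix[r2][c1]
            for r1, r2 in combinations(range(rows), 2)
            for c1, c2 in combinations(range(cols), 2)]


def slice_matrix(p, u3):
    """The matrix M_{u3}: rows indexed by u1, columns by u2, with X3 = u3 fixed."""
    return [[p[(u1, u2, u3)] for u2 in range(D2)] for u1 in range(D1)]


def ci_minors(p):
    """The six quadrics generating the ideal of X1 independent of X2 given X3."""
    return [m for u3 in range(D3) for m in minors_2x2(slice_matrix(p, u3))]


# Hypothetical data: p(u1,u2,u3) = Pr(X3=u3) * Pr(X1=u1 | X3=u3) * Pr(X2=u2 | X3=u3).
pr3 = [Fr(1, 3), Fr(2, 3)]
pr1_given3 = [[Fr(1, 4), Fr(3, 4)], [Fr(1, 2), Fr(1, 2)]]                  # [u3][u1]
pr2_given3 = [[Fr(1, 2), Fr(1, 3), Fr(1, 6)], [Fr(1, 5), Fr(2, 5), Fr(2, 5)]]  # [u3][u2]

p_ci = {(u1, u2, u3): pr3[u3] * pr1_given3[u3][u1] * pr2_given3[u3][u2]
        for u1, u2, u3 in product(range(D1), range(D2), range(D3))}

# Move a little probability mass between two cells: still a distribution,
# but no longer conditionally independent.
p_other = dict(p_ci)
p_other[(0, 0, 0)] += Fr(1, 100)
p_other[(0, 1, 0)] -= Fr(1, 100)

print(all(m == 0 for m in ci_minors(p_ci)))     # True
print(any(m != 0 for m in ci_minors(p_other)))  # True
```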

Since the ideal generated by the 2 by 2 minors of a generic matrix of
indeterminates is prime, we have the following fact (see [Stu02]).


Proposition 5.2.3. For any choice of A, B, and C, the ideal $I_{A \perp\!\!\!\perp B \mid C}$ is
prime.


The interesting part begins when we have more than one independence
statement.


Definition 5.2.4. If $\mathcal{M} = \{\mathcal{A}_1, \mathcal{A}_2, \dots, \mathcal{A}_r\}$ is a set of independence
statements, define
$I_{\mathcal{M}} = I_{\mathcal{A}_1} + \dots + I_{\mathcal{A}_r}$.


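Example 5.2.5 (The contraction lemma). In statistics, there is a lemma that
says that any probability distribution which satisfies the two independence
statements $X_1 \perp\!\!\!\perp X_2 \mid X_3$ and $X_2 \perp\!\!\!\perp X_3$ also satisfies
$X_2 \perp\!\!\!\perp \{X_1, X_3\}$.

In this example, we investigate the algebraic analog of this statement. Let


$\mathcal{M} = \{X_1 \perp\!\!\!\perp X_2 \mid X_3,\ X_2 \perp\!\!\!\perp X_3\}$.


Let’s suppose for now that $d_1 = d_2 = d_3 = 2$, i.e. we have three binary random
variables. The first independence statement translates into two quadratics:


$$\phi_1 = \det\begin{pmatrix} p_{111} & p_{121} \\ p_{211} & p_{221} \end{pmatrix},
\qquad
\phi_2 = \det\begin{pmatrix} p_{112} & p_{122} \\ p_{212} & p_{222} \end{pmatrix}.$$


The second statement translates into a single determinant:


$$\psi = \det\begin{pmatrix} p_{+11} & p_{+12} \\ p_{+21} & p_{+22} \end{pmatrix},$$


where for example $p_{+11} = p_{111} + p_{211}$.

So $I_{\mathcal{M}} = (\phi_1, \phi_2, \psi)$.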

If we consider the indeterminates of our polynomial ring to be $p_{+jk}$ and
$p_{ijk}$, for $i \geq 2$ (instead of the $p_{ijk}$), the ideal $I_\mathcal{M}$ is a binomial ideal in $\mathbb{C}[D]$, i.e.
generated by polynomials which are differences of two monomials. Binomial
ideals enjoy many nice properties. For instance, a reduced Gröbner basis, in
any term order, consists of binomials, and they have primary decompositions
where each associated prime and primary ideal is binomial. For more details,
see [ES96].

The algebraic analog of the contraction lemma is the primary decomposition
of this ideal. The ideal $I_\mathcal{M}$ has 3 components in its primary decomposition
(all prime):
$$I_\mathcal{M} = P_1 \cap P_2 \cap I_{X_2 \perp\!\!\!\perp \{X_1, X_3\}},$$


where $P_1 = (p_{+11}, p_{+21}, \phi_2)$ and $P_2 = (p_{+12}, p_{+22}, \phi_1)$. This implies that any
probability distribution which satisfies the two independence statements $\mathcal{M}$
also satisfies the statement $X_2 \perp\!\!\!\perp \{X_1, X_3\}$. The algebraic picture is more
complicated: outside of the probability simplex $\Delta$, these two zero sets differ.
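To make the last remark concrete, here is a small Python sketch (the point below is hand-picked and hypothetical, and it is not a probability distribution since some coordinates are negative). It lies on the component where $p_{+11} = p_{+21} = 0$, so all three generators of the ideal vanish on it, yet the 2 by 2 minors expressing the independence of $X_2$ from $\{X_1, X_3\}$ do not all vanish.

```python
from itertools import combinations

# A hypothetical point p[(i, j, k)] with negative entries, chosen so that
# p_{+11} = p_{+21} = 0.
p = {(1, 1, 1): 1, (2, 1, 1): -1, (1, 2, 1): 2, (2, 2, 1): -2,
     (1, 1, 2): 1, (2, 1, 2): 1, (1, 2, 2): 1, (2, 2, 2): 1}

def plus(j, k):
    """Marginal p_{+jk}: sum over the first index."""
    return p[(1, j, k)] + p[(2, j, k)]

# The three generators: two slice determinants and one marginal determinant.
phi1 = p[(1, 1, 1)] * p[(2, 2, 1)] - p[(1, 2, 1)] * p[(2, 1, 1)]
phi2 = p[(1, 1, 2)] * p[(2, 2, 2)] - p[(1, 2, 2)] * p[(2, 1, 2)]
psi = plus(1, 1) * plus(2, 2) - plus(1, 2) * plus(2, 1)

# X2 independent of {X1, X3}: a 2 x 4 matrix, rows indexed by j, columns by (i, k).
cols = [(i, k) for i in (1, 2) for k in (1, 2)]
rows = [[p[(i, j, k)] for (i, k) in cols] for j in (1, 2)]
joint_minors = [rows[0][a] * rows[1][b] - rows[0][b] * rows[1][a]
                for a, b in combinations(range(4), 2)]

print((phi1, phi2, psi))                   # the generators evaluated at the point
print(any(m != 0 for m in joint_minors))   # some minor of the 2 x 4 matrix is nonzero
```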


As a warmup for computing primary decompositions later, try 


Exercise 5.2.6. (a) Show, using Macaulay 2, that this is a primary decomposition
of $I_\mathcal{M}$.

(b) Consider the same $\mathcal{M}$, but now suppose that $d_1 = d_2 = 2$ and $d_3 = 3$.
Write down the ideal $I_\mathcal{M}$ and find a primary decomposition for $I_\mathcal{M}$. Is this
ideal radical? What if $d_3 \geq 4$?


5.2.1 Bayesian networks and associated ideals 


A Bayesian network is an acyclic directed graph $G$ with vertices $X_1, \ldots, X_n$.
For a given node $X_i$, let $\mathrm{pa}(X_i)$ denote the set of parents of vertex $X_i$ in
$G$ (a node $X_j$ is a parent of $X_i$ if there is a directed edge from $X_j$ to $X_i$), and
let $\mathrm{nd}(X_i)$ be the set of non-descendants of $X_i$, excluding the parents of $X_i$.
(A non-descendant of $X_i$ is a vertex $X_j$ such that there is no directed path
from $X_i$ to $X_j$. Since the graph is acyclic, parents are non-descendants.)
The local Markov property on $G$ is the set of independence statements


$$\mathrm{local}(G) = \{X_i \perp\!\!\!\perp \mathrm{nd}(X_i) \mid \mathrm{pa}(X_i) : i = 1, 2, \ldots, n\}.$$


The global Markov property, $\mathrm{global}(G)$, is the set of independence statements
$A \perp\!\!\!\perp B \mid C$, for any triple $A$, $B$, $C$ of pairwise disjoint subsets of vertices
of $G$ such that $A$ and $B$ are d-separated by $C$.

The notion of d-separated (“directed separated”) is a bit technical. The
intuition is that the nodes of C block directed paths from nodes of A to nodes 
of B, but the notion is slightly more subtle. Since we don’t really need the 
definition for these lectures, we refer to [GSS] or to [Lau96] for the definition. 

For any Bayesian network $G$, we have $\mathrm{local}(G) \subseteq \mathrm{global}(G)$. Therefore we
have inclusions $I_{\mathrm{local}(G)} \subseteq I_{\mathrm{global}(G)}$ and $V_{\mathrm{global}(G)} \subseteq V_{\mathrm{local}(G)}$.


Example 5.2.7. Let $G$ be the network on four binary random variables shown
in Figure 5.1. Download the file markov.m2 from the website³. This file contains code
for displaying a directed acyclic graph, computing independence conditions
(given a graph), and for computing the ideals corresponding to these independence
conditions. The documentation for the code is contained in the file.


i20 : load "markov.m2" 


³ http://www.math.cornell.edu/~mike/bayes/
Fig. 5.1. A Bayesian network on 4 vertices


The function makeGraph takes as input a list of lists: the $i$th list is the list of
direct descendants of the $i$th node.

i21 : G = makeGraph {{},{1},{1},{2,3}}; 
The Markov conditions come as a list of triples of sets of integers. Each triple 
represents a single independence statement. 

i22 : LM = localMarkovStmts G; 

i23 : LM/print; 


{Set {1}, Set {4}, Set {2, 3}} 
{Set {3}, Set {2}, Set {4}} 


i24 : GM = globalMarkovStmts G; 
i25 : GM/print; 
{Set {1}, Set {4}, Set {2, 3}} 
{Set {3}, Set {2}, Set {4}} 
Note that for this example, local(G) and global(G) are both the same set: 


$$\{\, 1 \perp\!\!\!\perp 4 \mid \{2,3\},\ \ 2 \perp\!\!\!\perp 3 \mid 4 \,\}.$$
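The local Markov statements for this graph can also be worked out directly from the definition. The following Python sketch is independent of markov.m2 (the function name `local_markov` is hypothetical). It assumes, as the printed output above suggests, that statements with an empty non-descendant set are dropped and that $A \perp\!\!\!\perp B \mid C$ and $B \perp\!\!\!\perp A \mid C$ are treated as the same statement.

```python
def local_markov(children):
    """children[v] lists the direct descendants of node v (nodes are 1..n)."""
    nodes = sorted(children)
    parents = {v: {w for w in nodes if v in children[w]} for v in nodes}

    def descendants(v):
        # Depth-first search along directed edges starting at v.
        seen, stack = set(), list(children[v])
        while stack:
            w = stack.pop()
            if w not in seen:
                seen.add(w)
                stack.extend(children[w])
        return seen

    statements = set()
    for v in nodes:
        nd = set(nodes) - {v} - descendants(v) - parents[v]
        if nd:  # skip trivial statements with an empty side
            sides = frozenset([frozenset([v]), frozenset(nd)])
            statements.add((sides, frozenset(parents[v])))
    return statements

# The graph of Example 5.2.7, in the same format as makeGraph {{},{1},{1},{2,3}}.
G = {1: [], 2: [1], 3: [1], 4: [2, 3]}
stmts = local_markov(G)
for sides, cond in sorted(stmts, key=lambda s: sorted(s[1])):
    a, b = sorted(sides, key=sorted)
    print(sorted(a), sorted(b), sorted(cond))
```

Storing the two sides of a statement as an unordered pair is what merges the statements coming from nodes 2 and 3 into one.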


A polynomial ring with the indeterminates $p_{u_1 u_2 u_3 u_4}$ is created via:
i26 : R = markovRing(2,2,2,2);


i27 : numgens R


o27 = 16
i28 : gens R
o28 = {p_(1,1,1,1), p_(1,1,1,2), p_(1,1,2,1), p_(1,1,2,2), p_(1,2,1,1), p_(1,2,1,2), p_(1,2,2,1), ···
o28 : List
Our two independence statements translate to the 2 by 2 minors of the following
six matrices (and, since each is only 2 by 2, the ideal is generated by
six quadrics).
i29 : M = markovMatrices(R, LM);
i30 : M/(m -> (<< m << endl << endl));
| p_(1,1,1,1) p_(1,1,1,2) |
| p_(2,1,1,1) p_(2,1,1,2) |

| p_(1,1,2,1) p_(1,1,2,2) |
| p_(2,1,2,1) p_(2,1,2,2) |

| p_(1,2,1,1) p_(1,2,1,2) |
| p_(2,2,1,1) p_(2,2,1,2) |

| p_(1,2,2,1) p_(1,2,2,2) |
| p_(2,2,2,1) p_(2,2,2,2) |

| p_(1,1,1,1)+p_(2,1,1,1) p_(1,2,1,1)+p_(2,2,1,1) |
| p_(1,1,2,1)+p_(2,1,2,1) p_(1,2,2,1)+p_(2,2,2,1) |

| p_(1,1,1,2)+p_(2,1,1,2) p_(1,2,1,2)+p_(2,2,1,2) |
| p_(1,1,2,2)+p_(2,1,2,2) p_(1,2,2,2)+p_(2,2,2,2) |


By changing coordinates as discussed above (so that e.g. $p_{1111}$ refers to $p_{+111}$,
and $p_{2111}$ still refers to $p_{2111}$), the ideal will be binomial in the new coordinates.
The function marginMap makes a ring map which will make this change of
coordinates.

i31 : F = marginMap(1,R);

o31 : RingMap R <--- R

i32 : F p_(1,1,1,1)

o32 = p_(1,1,1,1) - p_(2,1,1,1)

o32 : R


The routine markovIdeal yields the ideal generated by the 2 by 2 minors of 
the above matrices. After changing coordinates, the ideal is binomial: 


i33 : J = F markovIdeal(R,LM);
o33 : Ideal of R


i34 : transpose generators J 


o34 = {-2} | -p_(1,1,1,2)p_(2,1,1,1)+p_(1,1,1,1)p_(2,1,1,2) |
      {-2} | -p_(1,1,2,2)p_(2,1,2,1)+p_(1,1,2,1)p_(2,1,2,2) |
      {-2} | -p_(1,2,1,2)p_(2,2,1,1)+p_(1,2,1,1)p_(2,2,1,2) |
      {-2} | -p_(1,2,2,2)p_(2,2,2,1)+p_(1,2,2,1)p_(2,2,2,2) |
      {-2} | -p_(1,1,2,1)p_(1,2,1,1)+p_(1,1,1,1)p_(1,2,2,1) |
      {-2} | -p_(1,1,2,2)p_(1,2,1,2)+p_(1,1,1,2)p_(1,2,2,2) |

              6       1
o34 : Matrix R  <--- R


The ideal J is minimally generated by 6 binomial quadrics. 


One of the most useful aspects of Bayesian networks is that they provide 
a factorization of the joint probability distribution of the n random variables. 
In this example, note that 
$$\begin{aligned}
\Pr(X_1 = u_1, \ldots, X_4 = u_4) ={}& \Pr(X_4 = u_4) \times \Pr(X_3 = u_3 \mid X_4 = u_4) \times \Pr(X_2 = u_2 \mid X_3 = u_3, X_4 = u_4) \\
& \times \Pr(X_1 = u_1 \mid X_2 = u_2, X_3 = u_3, X_4 = u_4) \\
={}& \Pr(X_4 = u_4) \times \Pr(X_3 = u_3 \mid X_4 = u_4) \times \Pr(X_2 = u_2 \mid X_4 = u_4) \\
& \times \Pr(X_1 = u_1 \mid X_2 = u_2, X_3 = u_3).
\end{aligned}$$


If we set $\Pr(X_4 = 1) := a$ and $\Pr(X_4 = 2) := 1 - a$, and similarly
let $\Pr(X_3 = 1 \mid X_4 = k) := b_k$, let $\Pr(X_2 = 1 \mid X_4 = k) := c_k$, and
$\Pr(X_1 = 1 \mid X_2 = j, X_3 = k) := d_{jk}$, then the joint probabilities factor. For
example, $p_{1111} = a b_1 c_1 d_{11}$, $p_{1112} = (1 - a) b_2 c_2 d_{11}$, $p_{1121} = a (1 - b_1) c_1 d_{12}$, and
so on. Instead of requiring 15 parameters, such a probability distribution may
be specified using 9 numbers ($a$, $b_1$, $b_2$, $c_1$, $c_2$, and the four $d_{jk}$). This is a small example; when the number of
vertices is large and the graph is sparse, the savings are dramatic.
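The factorization can be checked numerically. The Python sketch below uses made-up parameter values and a hypothetical helper `pick`. It builds all 16 joint probabilities from $a$, $b_k$, $c_k$, $d_{jk}$, confirms that they sum to 1, evaluates the six quadrics coming from the two independence statements of this example, and counts the parameters used.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameter values (any numbers strictly between 0 and 1 would do).
a = F(1, 3)                              # Pr(X4 = 1)
b = {1: F(1, 2), 2: F(1, 4)}             # b[k]    = Pr(X3 = 1 | X4 = k)
c = {1: F(2, 5), 2: F(3, 7)}             # c[k]    = Pr(X2 = 1 | X4 = k)
d = {(1, 1): F(1, 6), (1, 2): F(5, 6),
     (2, 1): F(2, 3), (2, 2): F(1, 9)}   # d[j, k] = Pr(X1 = 1 | X2 = j, X3 = k)

def pick(prob_of_one, u):
    """Probability of state u in {1, 2}, given the probability of state 1."""
    return prob_of_one if u == 1 else 1 - prob_of_one

p = {(u1, u2, u3, u4): pick(a, u4) * pick(b[u4], u3) * pick(c[u4], u2)
                       * pick(d[(u2, u3)], u1)
     for u1, u2, u3, u4 in product((1, 2), repeat=4)}

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# 1 _||_ 4 | {2,3}: one matrix per (u2, u3), rows indexed by u1, columns by u4.
quadrics = [det2([[p[(i, j, k, l)] for l in (1, 2)] for i in (1, 2)])
            for j, k in product((1, 2), repeat=2)]
# 2 _||_ 3 | 4: one matrix per u4, with entries marginalized over u1.
quadrics += [det2([[p[(1, j, k, l)] + p[(2, j, k, l)] for j in (1, 2)]
                   for k in (1, 2)])
             for l in (1, 2)]

n_params = 1 + len(b) + len(c) + len(d)
print(sum(p.values()), quadrics, n_params)
```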

If we denote $\mathbb{C}[E] := \mathbb{C}[a, b_1, b_2, c_1, c_2, d_{11}, \ldots, d_{22}]$, we may define a ring
map


$$\Phi : \mathbb{C}[D] \to \mathbb{C}[E].$$


In what follows we shall assume that every edge $(i,j)$ of the Bayesian
network $G$ satisfies $i > j$. In particular, the node 1 is always a sink and the
node $n$ is always a source.

For any integer $r \in [n]$ and $u_i \in [d_i]$ as before, we abbreviate the
marginalization over the first $r$ random variables as follows:


$$p_{+ \cdots + u_{r+1} \cdots u_n} = \sum_{i_1=1}^{d_1} \sum_{i_2=1}^{d_2} \cdots \sum_{i_r=1}^{d_r} p_{i_1 i_2 \cdots i_r u_{r+1} \cdots u_n}.$$


This is a linear form in our polynomial ring $\mathbb{C}[D]$. We denote by $\mathbf{p}$ the product
of all of these linear forms.

As in the example, given a Bayesian network $G$, we obtain a factorization
map:


$$\Phi : \mathbb{C}[D] \to \mathbb{C}[E].$$


The main theorem, which is the algebraic analog of the factorization for the
joint probabilities for a Bayesian network, is the following:


Theorem 5.2.8. The prime ideal $\ker(\Phi)$ is a minimal primary component
of both of the ideals $I_{\mathrm{local}(G)}$ and $I_{\mathrm{global}(G)}$. More precisely,


$$(I_{\mathrm{local}(G)} : \mathbf{p}^\infty) = (I_{\mathrm{global}(G)} : \mathbf{p}^\infty) = \ker(\Phi). \qquad (5.2)$$


For the precise definition of $\Phi$ and a proof, see [GSS].
This result suggests many questions, most of them unsolved. For example: 


Problem 5.2.9. Find conditions on $G$ so that $I_{\mathrm{global}(G)}$ is a prime ideal (and
therefore equal to $\ker \Phi$).


Problem 5.2.10. Find the dictionary relating basic operations on directed
acyclic graphs (e.g. deletion of an edge, or of a node, or contraction of an edge)
with properties of the primary decomposition of the corresponding ideals.


Problem 5.2.11. Find the primary decomposition of $I_{\mathrm{local}(G)}$ or of $I_{\mathrm{global}(G)}$.


Perhaps more manageable is to determine certain features of the primary
decomposition (e.g. the ideal being radical, or having no embedded components)
in terms of the data $G$ and $(d_1, \ldots, d_n)$.

In the remainder of these lectures, we will develop the tools needed to 
answer these questions computationally, for small networks G. 


5.3 Lecture 3: Tools for computing primary 
decompositions 


In order to use the techniques we have already considered to make an algo- 
rithm for computing a primary decomposition, we must answer the following 
questions. 


• Question #1: How do we find splitting polynomials or zero divisors to use
with one of our splitting principles?
• Question #2: How can we detect that an ideal is prime or primary?
• Question #3: Practice shows that the splitting tree is highly redundant.
How should we fight this problem?


We will provide answers to these questions. But: keep your mind open. We 
challenge you to find better methods yourself! 


Example 5.3.1. As a running example throughout this lecture, let’s consider 
the simple example which occurred in the contraction lemma in the second 
lecture. This is an ideal generated by 3 quadrics, in 8 indeterminates. Let’s 
rename the indeterminates so that we can avoid indices. 


i35 : R = QQ[a..h];


i36 : J = ideal(a*d-b*c, e*h-f*g, a*f-b*e);
o36 : Ideal of R
Just so we know the answer ahead of time, here is the primary decomposition:


i37 : (primaryDecomposition J)/print;

ideal (b, a, f*g - e*h)

ideal (f, e, b*c - a*d)

ideal (f*g - e*h, d*g - c*h, b*g - a*h, d*e - c*f, b*e - a*f, b*c - a*d)
There are three primary components. In many ways, this is too simple an 
example: all of the components have the same dimension (five), and all of the 
primary components are prime, so this is a radical ideal. The example still 
provides a good picture of the different tools and also some of the problems 
which occur. 
5.3.1 Finding splitting polynomials and zero divisors


Given an ideal $J = (f_1, \ldots, f_r) \subset k[x_1, \ldots, x_n]$, how can we find a zero divisor
$g$ mod $J$ (i.e. an element $g$ for which $J : g \neq J$)? One method that often works
is to examine the generators $f_i$ and see if they factor. If so, use a factor as the
zero divisor $g$. Often no $f_i$ will factor. In this case, one may start computing a
Gröbner basis, and examine each new Gröbner basis element $g_i$. If $g_i$ factors,
use this factorization to split the ideal. (This is the basic description of what is
known as the factorizing Gröbner basis algorithm.) The exact details of how
best to use this are not clear, and vary with the problem domain. There is
definitely room for improvement here in existing algorithms!

Suppose that you cannot find a factor with one of these methods, or, 
perhaps, are unwilling or unable to look there for zero divisors? What then? 
Our answer is obtained by analyzing projection maps. 


5.3.2 Projections and elimination of variables 
Let $R = k[x] = k[x_1, \ldots, x_n]$, where $k$ is a field. Choose a subset of variables
$$t = \{t_1, \ldots, t_d\} \subset x = \{x_1, \ldots, x_n\}$$


and let $u = x \setminus t$. The inclusion $k[t] \subset k[u,t] = k[x]$ corresponds
geometrically to the projection map $k^n \to k^d$ defined by sending a point
$(u,t) = (u_1, \ldots, u_{n-d}, t_1, \ldots, t_d)$ to $t \in k^d$. The map of rings
$\phi : k[t] \to k[u,t]/J$ corresponds to the projection map $\pi : V(J) \subset k^n \to k^d$, and
the map of rings $k[t]/J_1 \hookrightarrow k[u,t]/J$ corresponds to the projection map
$\pi : V(J) \to V(J_1) = \overline{\pi(V(J))}$, where $J_1 = \ker(\phi)$. If $J$ is not a radical
ideal, or $k$ is not an algebraically closed field such as $\mathbb{C}$, then this correspondence
between the algebra and geometry needs to be defined more carefully:
this is where schemes enter the algebraic geometry picture. For us though, we 
will think geometrically, but work algebraically, and so we won’t be concerned 
with these subtleties. 

Recall that we can compute $J_1 = \ker(\phi)$ by using Gröbner bases. A term
order on $k[x] = k[u,t]$ is called an elimination order (eliminating $u$) if
$\mathrm{in}(f) \in k[t]$ implies that $f \in k[t]$.


Proposition 5.3.2. If $>$ is an elimination order eliminating $u$, and
$J \subset k[u,t]$ is an ideal, with Gröbner basis $\{f_1, \ldots, f_r, h_1, \ldots, h_s\}$, where $h_i \in k[t]$,
but each $f_i \notin k[t]$, then $\{h_1, \ldots, h_s\}$ is a Gröbner basis (and therefore a
generating set) of $J_1 = J \cap k[t]$.


For the purpose of analyzing projection maps, the product order $u \gg t$
is a good choice (this is sometimes called a block order): $u^A t^B > u^C t^D$ if
$u^A >_{\mathrm{grevlex}} u^C$, or $u^A = u^C$ and $t^B >_{\mathrm{grevlex}} t^D$.
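The comparison rule can be spelled out in a few lines of Python. This is only a sketch of the ordering on exponent vectors, not of any Macaulay 2 internals; the function names are hypothetical, and the variable order e, f, g, h, a, b, c, d with blocks of sizes 4 and 4 is the one used for the ring R1 in the example that follows.

```python
VARS = ["e", "f", "g", "h", "a", "b", "c", "d"]   # ring order, u-block then t-block
BLOCKS = [4, 4]                                   # block sizes of the product order

def grevlex_greater(x, y):
    """True if exponent vector x > y in graded reverse lexicographic order."""
    if sum(x) != sum(y):
        return sum(x) > sum(y)
    for xi, yi in zip(reversed(x), reversed(y)):
        if xi != yi:
            return xi < yi    # smaller exponent in the last differing variable wins
    return False

def product_greater(x, y, blocks=BLOCKS):
    """Compare block by block, using grevlex inside each block."""
    start = 0
    for size in blocks:
        bx, by = x[start:start + size], y[start:start + size]
        if bx != by:
            return grevlex_greater(bx, by)
        start += size
    return False

def exps(monomial):
    """Exponent vector of a monomial written as a string of variable names."""
    return [monomial.count(v) for v in VARS]

def lead(*monomials):
    """The largest of the given monomials in the product order."""
    best = monomials[0]
    for m in monomials[1:]:
        if product_greater(exps(m), exps(best)):
            best = m
    return best

print(lead("bc", "ad"), lead("eb", "fa"), lead("fg", "eh"), lead("ead", "fac"))
```

A monomial is encoded as a string of single-letter variables, with repeated letters for higher powers, which keeps the sketch short.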


Example 5.3.3. Continuing Example 5.3.1, suppose that $t = \{a,b,c,d\}$ and
$u = \{e,f,g,h\}$.
i38 : R1 = QQ[e,f,g,h,a,b,c,d, MonomialOrder=>ProductOrder{4,4}];
i39 : L = substitute(J,R1)

o39 = ideal (- b*c + a*d, - f*g + e*h, - e*b + f*a)

o39 : Ideal of R1

i40 : transpose gens gb L


o40 = {-2} | bc-ad   |
      {-2} | eb-fa   |
      {-2} | fg-eh   |
      {-3} | ead-fac |


               4        1
o40 : Matrix R1  <--- R1
So $J \cap k[a,b,c,d] = (bc - ad)$, since this is the only element whose lead term
$bc$ is in the subring $k[a,b,c,d]$. This whole process can be accomplished more
easily using the Macaulay 2 “eliminate” package.


i41 : load "eliminate.m2"


This next command ensures that e, f, g, h refer to elements of the ring R.
i42 : use R;


i43 : eliminate(J,{e,f,g,h})
o43 = ideal(b*c - a*d)


o43 : Ideal of R


5.3.3 Tool: Birational projections 


Suppose that $J$ contains an element $f$ which is linear in a variable, say, $x_1$.
Write $f = g x_1 + h$, where $g, h$ don’t involve $x_1$. If $g$ is a non-zero divisor modulo
$J$, then the map $k[t]/J_1 \to k[t, x_1]/J$ is birational (where
$t = \{x_2, \ldots, x_n\}$ and $J_1 = \ker(k[t] \to k[x_1, t]/J)$). Geometrically, this means that
for almost all points $p$ of $V(J_1) \subset k^{n-1}$, there is a unique point $(p_1, p) \in V(J)$
which maps to it. If $g(p) \neq 0$, then this value is $p_1 = -h(p)/g(p)$.


Birational maps are well-behaved with respect to primary decompositions:

Proposition 5.3.4. Let $J \subset k[x_1, \ldots, x_n]$ be an ideal, containing a polynomial
$f = g x_1 + h$, with $g, h$ not involving $x_1$, and $g$ a non-zero divisor modulo
$J$. Let $J_1 = J \cap k[x_2, \ldots, x_n]$ be the elimination ideal. Then

(a) $J = ((J_1, g x_1 + h) : g^\infty)$.

(b) $J$ is prime if and only if $J_1$ is prime.

(c) $J$ is primary if and only if $J_1$ is primary.

(d) Any irredundant primary decomposition of $J_1$ lifts to an irredundant
primary decomposition of $J$.

This tool may often be used to prove that an ideal is prime (if it is!), and
can sometimes simplify the work to look for zero divisors. However, caution is
required: the resulting ideal $J_1$, although it is an ideal in one fewer variable,
can sometimes be much more complicated than $J$.
Exercise 5.3.5. Prove this proposition. (There are at least two related methods
to do this: use pseudo-division by $g x_1 + h$, or use localization by powers
of $g$.)

Example 5.3.6. Continuing Example 5.3.1, all variables occur linearly, and so
we may choose any one we wish, e.g. $a$. The corresponding coefficient is $d$.
i44 : use R;
In this example, $d$ is not a zero divisor:
i45 : J : d == J
o45 = true


As above, we use the Macaulay 2 “eliminate” package for eliminating variables.


i46 : I1 = eliminate(J,a)
o46 = ideal (f*g - e*h, b*d*e - b*c*f)
o46 : Ideal of R
The variable $f$ occurs linearly, with coefficient $g$. It so happens that $g$ is also
a non-zero-divisor:
i47 : I1 : g == I1
o47 = true
So $I_1$ is birational to
i48 : I2 = eliminate(I1,f)
o48 = ideal(b*d*e*g - b*c*e*h)
o48 : Ideal of R
This single element has three factors:


i49 : time factor I2_0
     -- used 0.01 seconds


o49 = (b)(- d*g + c*h)(e)(-1)
o49 : Product


The original ideal $J$ is birational to $I_2$. Another way to factor this is to find
the primary decomposition of $I_2$!


i50 : time primaryDecomposition I2
     -- used 0.43 seconds


o50 = {ideal e, ideal b, ideal(d*g - c*h)}
o50 : List
Therefore, the original ideal has three components, all prime.
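The first projection in this example (eliminating $a$) can be checked by hand on a single point. The Python sketch below uses hypothetical helper names and a hand-picked point. It takes a point of the zero set of the elimination ideal $(fg - eh,\ bde - bcf)$ at which the coefficient $d$ is nonzero, recovers the missing coordinate from $ad - bc = 0$, and checks that the lifted point satisfies all three generators of $J$.

```python
from fractions import Fraction as F

def lift(b, c, d):
    """Solve a*d - b*c = 0 for a; requires the coefficient d to be nonzero."""
    return F(b * c, d)

def gens_J(a, b, c, d, e, f, g, h):
    """The three generators of J from Example 5.3.1."""
    return [a * d - b * c, e * h - f * g, a * f - b * e]

def gens_I1(b, c, d, e, f, g, h):
    """Generators of the ideal obtained by eliminating a."""
    return [f * g - e * h, b * d * e - b * c * f]

# A hypothetical point of the projected variety with d != 0.
point = dict(b=2, c=3, d=6, e=1, f=2, g=1, h=2)
a = lift(point["b"], point["c"], point["d"])
print(gens_I1(**point), a, gens_J(a, **point))
```

When $d = 0$ at a point, this formula is unavailable, which is why the statement only holds for almost all points.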


Exercise 5.3.7. Use this factorization, and ideal quotients, to produce the 
three primary components. 
5.3.4 Tool: The flattener of a projection


One method to find a splitting polynomial is to compute the flattener of
a projection. We develop this method now. This method has many other
applications, some of which we will see later.

Let $J \subset k[x_1, \ldots, x_n]$ be an ideal. A subset of variables $t = \{x_{i_1}, \ldots, x_{i_d}\}$
is called a maximal independent set of $J$ if $J \cap k[t] = (0)$ and $t$ has maximal
cardinality over all subsets with this property.


Proposition 5.3.8. Let $\mathrm{in}(J)$ be the initial monomial ideal of $J \subset k[x]$ with
respect to some arbitrary term order. Then every maximal independent set of 
in(J) is also a maximal independent set of J. 


The cardinality d of a maximal independent set of J is called the dimension 
of J. 

Geometrically, if $J \cap k[t] = (0)$, the map $V(J) \to k^d$ is dominant, i.e. the
closure of the image is all of $k^d$. In this case, every component of $J$ which also
maps dominantly to $k^d$ must have the same dimension $d$ as $J$. A component of
$J$ which maps into a proper subvariety of $k^d$ (algebraically: a primary ideal $Q$ of $J$ for
which $Q \cap k[t] \neq (0)$) can either have dimension $d$, or have smaller dimension.

Suppose that $t \subset x$ is a maximal independent set for $\mathrm{in}(J)$ and therefore
for $J$, let $u = x \setminus t$, and let $>$ be the product order $u \gg t$ defined above. Let
$\{g_1, \ldots, g_r\}$ be a reduced Gröbner basis for the ideal $J$, where


$$g_i = \alpha_i(t)\, u^{A_i} + \text{lower terms in the } u \text{ variables}.$$


Since $J \cap k[t] = (0)$, each of the monomials $u^{A_i} \neq 1$. Define
$\mathrm{in}_u(J) = (u^{A_1}, \ldots, u^{A_r}) \subset k[u]$.

Let $h \in k[t]$ be any non-zero element such that for each minimal generator
$u^A$ of the monomial ideal $\mathrm{in}_u(J)$, there is an element $g$ of $J$ such that
$g = h(t)\, u^A + \text{lower terms in } u$. For example, we could take
$h = \mathrm{lcm}\{\alpha_1, \ldots, \alpha_r\} \in k[t]$. Any such element $h$ is called a flattener for $J$ with respect to $t$.

The reason that $h$ is called a flattener comes from commutative algebra.
One can prove that the inclusion of localized rings $k[t]_h \subset k[u,t]_h / J$ is a
flat extension. Caution though: our element $h$ enjoys more properties than an
arbitrary element that satisfies this flatness.

The key property of a flattener, for our purposes, is the following observation.


Proposition 5.3.9. If $h \in k[t]$ is a flattener for $J$ with respect to $t$, and if $P$
is an associated prime ideal of $J$, then $h \in P$ if and only if $P \cap k[t] \neq (0)$.


Since a component $P$ of $J$ which satisfies $P \cap k[t] = (0)$ must have dimension
at least the cardinality of $t$, this implies:


Corollary 5.3.10. If $h \in k[t]$ is a flattener for $J$ with respect to $t$, then
$(J : h^\infty)$ is equidimensional of dimension $d$, and in particular has no embedded
components.

So, either $h$ is a splitting polynomial, or $J$ is equidimensional. In the first
case, we may split $J$. We will discuss the second situation later.


Example 5.3.11. Let’s use the flattener method to compute the primary
decomposition of the ideal of Example 5.3.1. Even though this is a simple
example, it highlights several possible efficiency problems.

First, we find a maximal independent set of J. (The Macaulay 2 routine 
independentSets returns the maximal independent sets of the initial mono- 
mial ideal of J. Each monomial represents one independent set. For example, 
the first set found is t = {a, b,d, f, h}). 

i51 : independentSets J

o51 = {a*b*d*f*h, a*c*d*f*h, a*c*e*f*h, c*d*e*f*h, a*b*d*g*h, a*c*d*g* ···

o51 : List 
We find 8 maximal independent sets. 
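For a monomial ideal, independence is a purely combinatorial condition: a set of variables $t$ is independent exactly when no monomial generator involves only variables from $t$. The Python sketch below enumerates the maximal independent sets by brute force. It assumes that the initial ideal of $J$ for the graded reverse lexicographic order with $a > b > \cdots > h$ is $(bc, be, fg, ade)$; this was worked out by hand for this sketch and is not taken from the session above. The function names are hypothetical.

```python
from itertools import combinations

VARS = "abcdefgh"
# Assumed initial ideal of J (grevlex, a > b > ... > h), computed by hand:
# in(J) = (b*c, b*e, f*g, a*d*e). Each generator is stored as its set of variables.
IN_J = [set("bc"), set("be"), set("fg"), set("ade")]

def is_independent(t, monomial_gens):
    """t is independent iff no generator involves only variables of t."""
    return not any(m <= set(t) for m in monomial_gens)

def max_independent_sets(variables, monomial_gens):
    """All independent sets of the largest possible cardinality."""
    for size in range(len(variables), -1, -1):
        found = ["".join(t) for t in combinations(variables, size)
                 if is_independent(t, monomial_gens)]
        if found:
            return found
    return []

sets = max_independent_sets(VARS, IN_J)
print(len(sets), sets)
```

Brute force is adequate for 8 variables; for larger rings a smarter search would be needed.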

i52 : R1 = QQ[c,e,g, a,b,d,f,h, MonomialOrder=>ProductOrder{3,5}];

i53 : L = substitute(J,R1)

o53 = ideal (- c*b + a*d, e*h - g*f, - e*b + a*f)

o53 : Ideal of R1

i54 : gens gb L

o54 = | eh-gf eb-af cb-ad gbf-afh caf-ead |

               1        5
o54 : Matrix R1  <--- R1


By examining the lead terms and coefficients, we see that $\mathrm{in}_u(J) = (c, e, g)$,
and that the lead coefficients of $c$ are $af$ and $b$, the lead coefficients of $e$ are
$b$ and $h$, and the lead coefficient of $g$ is $bf$. Therefore $abf$ is a flattener. Let
$F = abf$. A better choice for a flattener would be $bf$. We choose $abf$ instead to
show some of the complexities which arise when you choose a flattener which
is not the simplest. As an exercise, you should do the same computation here
with the flattener $bf$.
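The bookkeeping in this paragraph can be mimicked with monomials stored as strings. The Python sketch below uses hypothetical names and only handles monomial lead coefficients, as in this example. It records which lead coefficients occur for each generator of $\mathrm{in}_u(J)$ and tests a candidate $h$ with a sufficient condition: for every generator, some recorded lead coefficient must divide $h$, since the corresponding Gröbner basis element can then be multiplied up to have lead coefficient $h$.

```python
from collections import Counter

# Lead coefficients read off the Groebner basis above, as monomials in t:
# each generator of in_u(J) maps to the coefficients with which it occurs.
LEAD_COEFFS = {"c": ["af", "b"], "e": ["b", "h"], "g": ["bf"]}

def lcm(monomials):
    """Least common multiple of monomials written as strings of variables."""
    out = Counter()
    for m in monomials:
        for var, exp in Counter(m).items():
            out[var] = max(out[var], exp)
    return "".join(v * out[v] for v in sorted(out))

def divides(m, n):
    """True if the monomial m divides the monomial n."""
    need, have = Counter(m), Counter(n)
    return all(have[v] >= k for v, k in need.items())

def is_flattener_candidate(h, lead_coeffs=LEAD_COEFFS):
    """Sufficient test: every generator has a lead coefficient dividing h."""
    return all(any(divides(c, h) for c in cs) for cs in lead_coeffs.values())

print(lcm(["af", "b", "bf"]), lcm(["b", "b", "bf"]))
print(is_flattener_candidate("abf"), is_flattener_candidate("bf"),
      is_flattener_candidate("b"))
```

Taking the least common multiple of one coefficient per generator always passes this test; picking the smallest coefficient for each generator gives the simpler choice.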


i55 : use R

o55 = R

o55 : PolynomialRing

i56 : J1 = saturate(J,a*b*f)

o56 = ideal (f*g - e*h, d*g - c*h, b*g - a*h, d*e - c*f, b*e - a*f, b* ···

o56 : Ideal of R

i57 : J1 == J : (a*b*f)

o57 = true

So $J = J_1 \cap J_2$, where
i58 : J2 = trim(J + ideal(a*b*f))

o58 = ideal (f*g - e*h, b*e - a*f, b*c - a*d, a*b*f)

o58 : Ideal of R

i59 : J == intersect(J1,J2)

o59 = true
As it turns out, $J_1$ is a prime ideal. How can we see this? Since the initial
ideal $\mathrm{in}_u(J_1) = (c, e, g)$, this means that the projection map is birational, and
therefore the ideal $J_1$ is prime and, even more, is rational.

i60 : Q1 = J1;

o60 : Ideal of R
Now let’s decompose $J_2$.

i61 : independentSets J2

o61 = {c*d*e*f*h, a*b*d*g*h, a*c*d*g*h, c*d*e*g*h}

o61 : List
We'll use the first one.

i62 : R1 = QQ[a,b,g, c,d,e,f,h, MonomialOrder=>ProductOrder{3,5}];

i63 : L = substitute(J2,R1)

o63 = ideal (g*f - e*h, - a*f + b*e, - a*d + b*c, a*b*f)

o63 : Ideal of R1

i64 : gens gb L

o64 = | gf-eh af-be ad-bc bde-bcf bge-aeh b2e b2cf abeh a2eh2 |

               1        9
o64 : Matrix R1  <--- R1


In this case $\mathrm{in}_u(J_2) = (a, b, g)$ (so the saturation will again be rational and
prime, as before). One choice for a flattener is $f(de - cf)$.
i65 : use R

o65 = R

o65 : PolynomialRing

i66 : Q2 = saturate(J2,f*(d*e-c*f))

o66 = ideal (b, a, f*g - e*h)

o66 : Ideal of R

i67 : J3 = trim(J2 + ideal(f*(d*e-c*f)))

                                                         2
o67 = ideal (f*g - e*h, b*e - a*f, b*c - a*d, d*e*f - c*f , a*b*f)

o67 : Ideal of R
i68 : J == intersect(Q1,Q2,J3)

o68 = true
One more time. Let’s decompose $J_3$.
i69 : independentSets J3

o69 = {a*b*d*g*h, a*c*d*g*h}

o69 : List

i70 : R1 = QQ[c,e,f, a,b,d,g,h, MonomialOrder=>ProductOrder{3,5}]

o70 = R1

o70 : PolynomialRing

i71 : L = substitute(J3,R1)

                                                     2
o71 = ideal (- e*h + f*g, e*b - f*a, c*b - a*d, - c*f  + e*f*d, f*a*b)

o71 : Ideal of R1

i72 : transpose gens gb L

o72 = {-2} | eh-fg   |
      {-2} | eb-fa   |
      {-2} | cb-ad   |
      {-3} | fbg-fah |
      {-3} | cfa-ead |
      {-3} | fab     |
      {-3} | cf2-efd |
      {-4} | fa2h    |
      {-4} | f2a2    |
      {-4} | fa2d    |
      {-5} | ea2d2   |

               11        1
o72 : Matrix R1   <--- R1


This time, $\mathrm{in}_u(J_3) = (c, e, f)$, and so once again the saturation will be a
prime rational ideal. A flattener that works this time is ab. Notice that there 
are other choices for flatteners, but the others are more complicated and would 
add extra work. 

i73 : use R 


o73 = R 

073 : PolynomialRing 

i74 : Q3 = saturate(J3,a*b)
o74 = ideal (f, e, b*c - a*d)
o74 : Ideal of R

i75 : Q3 == J3 : (a*b)

o75 = false

i76 : Q3 == J3 : (a*b)^2


o76 = true 


This time,
i77 : J4 = trim(J3 + ideal(a^2*b^2))

                                                         2          2 2
o77 = ideal (f*g - e*h, b*e - a*f, b*c - a*d, d*e*f - c*f , a*b*f, a b )

o77 : Ideal of R

But notice that
i78 : J == intersect(Q1,Q2,Q3)
o78 = true


Therefore, we may avoid the primary decomposition of $J_4$, since it will only
consist of redundant terms. You should check, but the primary decomposition
of $J_4$ has seven primary ideals, and $J_4$ is not a radical ideal.


Exercise 5.3.12. Apply this technique to other Bayesian network examples,
such as the example from the contraction lemma. Consider the cases when
$d_i > 2$ for a nice challenge.


5.3.5 Primary decomposition of equidimensional ideals 


Here is the situation: Suppose that $J \subset k[x_1, \ldots, x_n] = k[u,t]$, where $t$ is
a maximal independent set, as above, and that $h \in k[t]$ is a flattener, and
$J : h^\infty = J$. How can we tell if $J$ is prime, or primary? And, if not, how do
we find a primary decomposition of $J$?

In the previous example, we used the following fact, which we leave (as 
usual!) for you to prove as an exercise. 


Proposition 5.3.13. Suppose that $t$ is a maximal independent set for $J$,
$u = x \setminus t$, and $h$ is a flattener for $J$ with respect to $t$. If $\mathrm{in}_u(J)$ is generated by the
set of indeterminates $u$, then $(J : h^\infty)$ is a prime ideal, and is also rational.


With certain kinds of ideals, such as Markov ideals, this happens quite 
frequently, as we saw in the previous example. If not, what do we do then? 
Once again, the flattener comes to the rescue. Algebraically, the flattener h 
allows us to compute the “generic fiber”: 


Proposition 5.3.14. If $h \in k[t]$ is a flattener as defined above, then
$$(J : h^\infty) = J k(t)[u] \cap k[u,t].$$


But notice! Since $t$ is a maximal independent set, $J k(t)[u]$ is a zero-dimensional
ideal of $k(t)[u]$. This means that if we can find the primary decomposition
of zero-dimensional ideals, then we can compute a primary decomposition
of the equidimensional ideal $J$.


Proposition 5.3.15. If $J k(t)[u] \cap k[u,t] = J$, and if $\overline{Q}_1 \cap \cdots \cap \overline{Q}_s$ is an
irredundant primary decomposition of $J k(t)[u]$, and if $Q_i = \overline{Q}_i \cap k[u,t]$, then
$J = Q_1 \cap \cdots \cap Q_s$ is an irredundant primary decomposition of $J$.
This is great! It allows us to use the results from David Cox’s lectures
(Chapter 2) on computing the primary decomposition of zero-dimensional
ideals. These techniques tell us in particular that, if $J k(t)[u]$ is prime or primary,
then $J$ will have the same property.


Exercise 5.3.16. Refer back to Chapter 2, and write an algorithm for computing
the primary decomposition of an equidimensional ideal. Apply your
algorithm to the equidimensional ideal
$$J = (ad^2 + bde + ce^2,\ ad^2 + bdf + cf^2,\ ae^2 + bef + cf^2) \subset \mathbb{Q}[a, \ldots, f].$$


5.4 Lecture 4: Putting it all together 


In this lecture we apply all of our techniques and present a relatively com- 
plete algorithm. Keep in mind though that if you have a difficult ideal whose 
decomposition you desire, canned algorithms often will not finish. Using the 
techniques we have discussed may make it possible to find its decomposition, 
by applying them in novel ways, or using some extra information you have 
regarding your ideal. 


5.4.1 Useful subroutines 


Some of the techniques that we have considered so far can be summarized by 
the following routines. 


saturation(J,f), returns the pair $(d, J : f^\infty)$, where $(J : f^\infty) = (J : f^d)$.


independentSet (J) , returns a maximal independent set for the ideal J. 


We did not discuss an algorithm for this, but it is a good exercise to find 
one. See also [DGP99]. 


flattener(J,t), given a maximal independent set $t$ of $J$, returns the pair
$(h, \mathrm{in}_u(J))$, where $h$ is a flattener of $J$ with respect to $t$, and $u = x \setminus t$ and $\mathrm{in}_u(J)$
are as in the last lecture.


equidimensionalPD(J, t, h), where $t$ is a maximal independent set of $J$, $h$
is a flattener, and $J : h^\infty = J$. This routine returns a list of pairs $\{P, Q\}$ such
that $Q$ is $P$-primary, and the intersection of all of the $Q$ is $J$. Note that the
$P$ are all of the same dimension, and there are no embedded primes.


As discussed later, in order to handle redundancy, we instead use the
following variant.


equidimensionalPD(J, t, h, L), where $t$ is a maximal independent set of
$J$, $h$ is a flattener, $J : h^\infty = J$, and $L$ is an ideal. This routine returns
a pair $(PQ, L')$, where $PQ$ is a list of pairs $\{P, Q\}$ as above, except that only the
pairs $\{P, Q\}$ with $L \not\subset P$ are returned, and $L'$ is the intersection of $L$ with all
of the returned $Q$'s.
5.4.2 Fighting redundancy


Suppose that we have an algorithm for finding a splitting polynomial for an 
ideal J, if one exists. Call these routines thereIsASplittingPolynomial and 
splittingPolynomial. Recall that if no splitting polynomial exists, then J 
is primary. 

Here is the most naive method based on splitting principles: 

PDsplit = (J) -> (
    -- input: an ideal J
    -- output: a list of primary ideals forming a (possibly redundant)
    --         primary decomposition of J
    if thereIsASplittingPolynomial(J)
    then (
        f := splittingPolynomial(J);
        (d,J1) := saturation(J,f);   -- J1 = (J : f^infinity) = (J : f^d)
        J2 := J + ideal(f^d);
        return join(PDsplit(J1), PDsplit(J2))   -- join the 2 lists
        )
    else
        -- J is already primary,
        -- so return a list with one element.
        return {J}
    );

This algorithm is written in Macaulay 2 form for convenience, but in or- 
der to run, routines thereIsASplittingPolynomial, splittingPolynomial 
must be provided. 
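The recursion is easier to follow on a toy analog where every step is elementary. The Python sketch below is an illustration only and is not an implementation for polynomial rings: it works with ideals $(n)$ in the integers, where $(n)$ is primary exactly when $n$ is a prime power, a prime factor $p$ plays the role of the splitting polynomial, the saturation $(n) : p^\infty$ is generated by $n$ with all factors of $p$ removed, and $(n) + (p^d) = (p^d)$. The function names mirror the routines above but are otherwise hypothetical.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n, for an integer n >= 2."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n

def saturation(n, p):
    """Return (d, m) with (n) : p^infinity = (m) = (n) : p^d and d minimal."""
    d = 0
    while n % p == 0:
        n //= p
        d += 1
    return d, n

def pd_split(n):
    """Primary decomposition of the ideal (n) in the integers, for n >= 2."""
    p = smallest_prime_factor(n)
    d, j1 = saturation(n, p)
    if j1 == 1:
        return [n]        # n is a prime power, so (n) is already primary
    j2 = p ** d           # (n) + (p^d) is generated by gcd(n, p^d) = p^d
    return pd_split(j1) + pd_split(j2)

print(pd_split(360))
```

In this toy setting the decomposition is never redundant; the redundancy discussed next is a feature of polynomial rings, where adding a power of the splitting polynomial can create many new components.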

There are several problems with this algorithm. One difficulty is that the 
algorithm does not return an irredundant primary decomposition. It is far 
worse: it computes the primary decomposition for a potentially large number 
of useless redundant ideals. 

We may think of this computation as a binary tree, where J is the root, 
$J_1$ the left node, and $J_2$ the right node. By choosing splitting polynomials
for each leaf, we continue to build a larger and larger binary tree, until no 
splitting polynomials can be found, and then each leaf is a primary ideal. 

At any time during the construction of this tree, the intersection of all of 
the leaf ideals is equal to J. The problem of redundancy is: many leaves or 
whole subtrees are not needed. How can we detect this? 

There are two simple methods that help. 


Method #1 


If we don’t mind computing extra ideal quotients, then we can remove many 
redundant components from this tree. 


Lemma 5.4.1. If $(J : f^\infty) = J$ and $(J : g^\infty) = (J : g^\ell)$, then


$$J = (J : g^\infty) \cap ((J + (g^\ell)) : f^\infty).$$


The proof is almost identical to the proof of Lemma 5.1.14. 
Method #2


The second trick is to process this tree from left to right, and keep the intersection
$L$ of all primary components found so far. Ignore a node $J_i$ if $L \subset J_i$.


Exercise 5.4.2. Convince yourself that both of these methods are valid. 


5.4.3 The overall algorithm 


We now describe one way to put these techniques together into an algorithm. 
This is a version of the Gianni-Trager-Zacharias (GTZ) algorithm, which often 
works quite well. 

Before we present the algorithm, you should spend some time working 
on the following exercise. If you can solve it quickly, then find other, better, 
solutions! 


Exercise 5.4.3. Write an algorithm which computes a primary decomposition
of an ideal $J \subset k[x] = k[x_1, \ldots, x_n]$. You may use any of the techniques
presented so far, and any of the subroutines saturation, independentSet, 
etc. 

Try to implement your solution, and try it on several examples, includ- 
ing some Markov ideals, as in the second lecture. Some questions you should 
consider are: (1) How to process and compute with as few redundant compo- 
nents as possible? (2) The precise element which one splits the ideal by has 
a dramatic effect on the complexity of the computation. Is there any way to 
control this? 


The GTZ algorithm, as we have it here, splits the ideal J by using a 
flattener. It is relatively easy to compute the primary decomposition of the 
left hand side (ideal J1 below), since this ideal is equidimensional. We have 
indicated that one way to do this is to use multiplication maps, as in Chapter 2. 
Another way is to use a change of coordinates to bring the ideal into a nice 
position. See [GTZ88] and [DGP99] for more details. The right hand side 
(ideal J2 in the algorithm) causes more problems, since by adding $f^d$ to the
ideal, the number of components can become quite large. This is the reason 
for using the redundancy control method #2 above. The ideal L which is an 
argument to GTZ and equidimensionalPD implements this method. 

The final algorithm is here: 

PD = (J) -> ( 
     -- input: J is an ideal 
     -- output: a list of pairs {P,Q} such that Q is P-primary 
     --         and the Q form an irredundant primary decomposition 
     --         of J. 
     L := ideal(1_(ring(J))); 
     (PQ,L) = GTZ(J,L); 
     -- at this point, L should equal J. 
     PQ 
     ); 

GTZ = (J, L) -> ( 
     -- input: J is an ideal. 
     --        L is an ideal, which is the intersection of all 
     --        primary components found so far. 
     -- output: a pair (PQ, L'), where 
     --        PQ is a list of {P,Q}'s, where P is prime, Q is 
     --           primary to P 
     --        L' is the intersection of L with all of the Q's in PQ. 
     -- The set of Q's in PQ form that part of the primary 
     -- decomposition of J with primary ideals not containing L. 
     if isSubset(L,J) then return ({}, L); 
     t := independentSet(J); 
     (f,inJ) := flattener(J,t); 
     (d,J1) := saturation(J,f); 
     if degree inJ == 1 
     then ( 
          -- J1 is prime: record the single pair {P,Q} with P = Q = J1 
          PQ1 = {{J1,J1}}; 
          L = intersect(L, J1); 
          ) 
     else 
          -- This also replaces L with the intersection 
          (PQ1,L) = equidimensionalPD(J1,t,L); 
     if d == 0 then return (PQ1,L); 
     J2 := J + ideal(f^d); 
     (PQ2,L) = GTZ(J2,L); 
     (join(PQ1,PQ2), L) 
     ); 


5.4.4 A harder example: the primary decomposition of a Markov 
ideal 


Exercise 5.4.4. Consider the graph G with 5 vertices, and directed edges 
$5 \to 4$, $5 \to 3$, $4 \to 2$, $3 \to 2$, $2 \to 1$ (see Figure 5.2). Find a 
primary decomposition of the global Markov ideal of this graph, in the case 
when the five random variables are all binary (i.e. $d_1 = \cdots = d_5 = 2$). 

Before looking at the answer below, try this on your own. You should use 
the marginMap trick or something similar to make the resulting ideal as simple 
as possible. 


We now present the answer to this exercise using Macaulay 2. We do 
not use canned algorithms, although we do use the routines independentSet, 
saturation0, and flattener. The routine saturation0 is the same as saturation 
as described above, except that only the ideal is returned. As of this writing, no system 
can do these primary decompositions in a small amount of time using canned 
algorithms. However, by the time you read this, specific computations might 
be much faster. So you should read this solution with a skeptical eye: we have 
chosen a technique that seems to work faster at the moment. Your solution 
might be different, and possibly more elegant, or more efficient in use of time 
and computer memory. 

First, let’s generate the ideal, and see what we are up against. You can 
find this next file at the same web location as markov.m2. 




Fig. 5.2. The Bayesian network in Exercise 5.4.4 


i82 : load "cimpa-tools.m2"; 
i83 : G = makeGraph {{},{1},{2},{2},{3,4}}; 
i84 : GM = globalMarkovStmts G; 


i85 : GM/print; 

{Set {1}, Set {4, 5, 3}, Set {2}} 
{Set {1}, Set {4, 5}, Set {2, 3}} 
{Set {5, 3}, Set {1}, Set {4, 2}} 
{Set {5}, Set {1}, Set {4, 2, 3}} 
{Set {1}, Set {4, 3}, Set {5, 2}} 
{Set {1}, Set {4}, Set {5, 2, 3}} 
{Set {3}, Set {1}, Set {4, 5, 2}} 
{Set {5}, Set {2}, Set {4, 1, 3}} 
{Set {5}, Set {1, 2}, Set {4, 3}} 
{Set {3}, Set {4}, Set {5}} 


These global Markov statements are all consequences of 
$X_1 \perp\!\!\!\perp \{X_3, X_4, X_5\} \mid X_2$, 
$X_5 \perp\!\!\!\perp \{X_1, X_2\} \mid \{X_3, X_4\}$, and 
$X_3 \perp\!\!\!\perp X_4 \mid X_5$. 

i86 : R = markovRing(2,2,2,2,2); 
As mentioned previously, it is often better to change coordinates using the 
following margin map. 

i87 : F = marginMap(1,R); 

o87 : RingMap R <--- R 

i88 : J = trim F markovideal(R,GM); 

o88 : Ideal of R 

i89 : betti J 

o89 = generators: total: 1 74 
                      0: 1  . 
                      1: . 74 

The ideal J is minimally generated by 74 homogeneous quadrics, 72 of which 
are binomials, and the last 2 have 28 monomials each. These last two 
generators increase the difficulty of the computation, especially for canned 
algorithms. This ideal has codimension 20 and degree 2240: 


i90 : (codim J, degree J) 

o90 = (20, 2240) 

o90 : Sequence 

Our plan is to determine a primary decomposition of the ideal J. We could 
start our computation in a number of ways. We choose to use the GTZ method, 
at least for this first step. 

i91 : u = independentSet J; 

i92 : (inJ,h) = flattener(J,u); 

i93 : degree inJ 

o93 = 1 

Since the degree is one, the projection map is birational, and so the ideal J0, 
which we will compute now, is a rational prime. 

i94 : J0 = saturation0(J,h); 

o94 : Ideal of R 

i95 : betti J0 

o95 = generators: total: 1 175 
                      0: 1   . 
                      1: .  74 
                      2: .   . 
                      3: . 101 

i96 : (codim J0,degree J0) 

o96 = (20, 1496) 

o96 : Sequence 

The prime component J0 of J coincides with the canonical component coming 
from the factorization theorem. This ideal is generated by J, together with 
101 quartics. 

We would like to perform the ideal quotient (J : J0), but J0 has 101 
quartics, and so the computation takes some time. Often it is the case that 
such an ideal quotient is equal to (J : f), for some f. This happens with the 
first element we try: 

i97 : Jrest = J : J0_74; 

o97 : Ideal of R 

i98 : J == intersect(J0,Jrest) 

o98 = true 

Since these are equal, and since we have computed Jrest as an ideal quotient, 
an irredundant primary decomposition of Jrest, together with J0, will give an 
irredundant primary decomposition of J. As an exercise, you should try to 
compute the primary decomposition of (J, h). It has many components which 
are not needed. 

The ideal Jrest is generated by J and the following 8 quadrics. 


i99 : L = ideal(p_(1,1,1,1,2)*p_(1,1,2,2,2)+p_(1,1,2,2,2)*p_(1,2,1,1,2) 
          +p_(1,1,1,1,2)*p_(1,2,2,2,2)+p_(1,2,1,1,2)*p_(1,2,2,2,2), 
        p_(1,1,1,1,1)*p_(1,1,2,2,2)+p_(1,1,2,2,2)*p_(1,2,1,1,1) 
          +p_(1,1,1,1,1)*p_(1,2,2,2,2)+p_(1,2,1,1,1)*p_(1,2,2,2,2), 
        p_(1,1,1,1,2)*p_(1,1,2,2,1)+p_(1,1,2,2,1)*p_(1,2,1,1,2) 
          +p_(1,1,1,1,2)*p_(1,2,2,2,1)+p_(1,2,1,1,2)*p_(1,2,2,2,1), 
        p_(1,1,1,1,1)*p_(1,1,2,2,1)+p_(1,1,2,2,1)*p_(1,2,1,1,1) 
          +p_(1,1,1,1,1)*p_(1,2,2,2,1)+p_(1,2,1,1,1)*p_(1,2,2,2,1), 
        p_(1,1,1,2,2)*p_(1,1,2,1,2)+p_(1,1,2,1,2)*p_(1,2,1,2,2) 
          +p_(1,1,1,2,2)*p_(1,2,2,1,2)+p_(1,2,1,2,2)*p_(1,2,2,1,2), 
        p_(1,1,1,2,1)*p_(1,1,2,1,2)+p_(1,1,2,1,2)*p_(1,2,1,2,1) 
          +p_(1,1,1,2,1)*p_(1,2,2,1,2)+p_(1,2,1,2,1)*p_(1,2,2,1,2), 
        p_(1,1,1,2,2)*p_(1,1,2,1,1)+p_(1,1,2,1,1)*p_(1,2,1,2,2) 
          +p_(1,1,1,2,2)*p_(1,2,2,1,1)+p_(1,2,1,2,2)*p_(1,2,2,1,1), 
        p_(1,1,1,2,1)*p_(1,1,2,1,1)+p_(1,1,2,1,1)*p_(1,2,1,2,1) 
          +p_(1,1,1,2,1)*p_(1,2,2,1,1)+p_(1,2,1,2,1)*p_(1,2,2,1,1)); 

o99 : Ideal of R 

i100 : Jrest == J+L 

o100 = true 

The next step is to decompose Jrest. We proceed in the same manner. 
i101 : u = independentSet Jrest; 

i102 : (inJrest,h) = flattener(Jrest,u); 

i103 : degree inJrest 

o103 = 1 


Once again, the component J1 which we now compute is a rational prime 
ideal, since the projection is birational. 

i104 : factors h 

o104 = {p_(1,1,2,2,2)+p_(1,2,2,2,2), p_(1,1,2,1,2)+p_(1,2,2,1,2), p_(1,2,2,2,2)} 

o104 : List 

i105 : J1 = saturation0(Jrest,h); 

o105 : Ideal of R 


The ideal J1 has the following generators, in addition to those of J or 
Jrest. 


i106 : M = ideal( ... ) 

o106 : Ideal of R 

i107 : J1 == J+M 

o107 = true 

i108 : J1 == Jrest + M 

o108 = true 

i109 : betti J1 

o109 = generators: total: 1 74 
                       0: 1  4 
                       1: . 70 

i110 : (codim J1, degree J1) 

o110 = (20, 170) 

o110 : Sequence 


We could continue in the same manner, taking ideal quotients, hoping that 
the ideals still intersect to give Jrest. However, we can use symmetry to find 
three more prime ideals. One symmetry that is evident from the graph is to 
interchange random variables 3 and 4: 


i111 : s = map(R,R,toList apply((1,1,1,1,1)..(2,2,2,2,2), x -> ( 
            p_(x#0,x#1,x#3,x#2,x#4)))) 

o111 = map(R,R,{p_(1,1,1,1,1), p_(1,1,1,1,2), p_(1,1,2,1,1), p_(1,1,2,1,2), p_(1,1,1,2,1), ... 

o111 : RingMap R <--- R 

i112 : s L == L 

o112 = true 

A second symmetry is to interchange the values 1 and 2 of the ith random 
variable. The permutation t changes these values for the 3rd random variable 
(this is the only one that produces new ideals). 


i113 : t = map(R,R,toList apply((1,1,1,1,1)..(2,2,2,2,2), x -> ( 
            p_(x#0,x#1,3-x#2,x#3,x#4)))) 

o113 = map(R,R,{p_(1,1,2,1,1), p_(1,1,2,1,2), p_(1,1,2,2,1), p_(1,1,2,2,2), p_(1,1,1,1,1), ... 

o113 : RingMap R <--- R 

i114 : t J == J 

o114 = true 

i115 : t (J+L) == J+L 

o115 = true 
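
As a cross-check outside Macaulay 2, the index permutations behind s and t can be tabulated in a few lines of Python. This is only a sketch of ours: it assumes that the variables of R are listed in the order (1,1,1,1,1)..(2,2,2,2,2), as in the definitions of s and t, and the names indices, first_s and first_t are hypothetical helpers, not part of cimpa-tools.m2.

```python
from itertools import product

# Index tuples of the variables p_(x1,...,x5), assumed to be listed in the
# order (1,1,1,1,1)..(2,2,2,2,2) used when s and t are defined.
indices = list(product((1, 2), repeat=5))

def s(x):
    """Swap random variables 3 and 4 (positions 2 and 3, counting from 0)."""
    return (x[0], x[1], x[3], x[2], x[4])

def t(x):
    """Exchange the values 1 and 2 of the 3rd random variable."""
    return (x[0], x[1], 3 - x[2], x[3], x[4])

# Index tuples of the images of the first five variables under s and t.
first_s = [s(x) for x in indices[:5]]
first_t = [t(x) for x in indices[:5]]

# Both maps are involutions, and each one permutes the 32 variables.
assert all(s(s(x)) == x and t(t(x)) == x for x in indices)
assert sorted(map(s, indices)) == indices == sorted(map(t, indices))

print(first_s)
print(first_t)
```

The two printed lists give the subscripts of the first five entries of the ring maps s and t; since both maps only permute variables, they send prime ideals to prime ideals of the same codimension and degree.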


We now produce three new prime ideals from J1. 
i116 : J2 = trim(J + s M) 

o116 = ideal (p_(1,1,2,1,2)+p_(1,2,2,1,2), p_(1,1,2,1,1)+p_(1,2,2,1,1), p_(1,1,1 ... 

o116 : Ideal of R 

i117 : J3 = trim(J + t M) 

o117 = ideal (p_(1,1,2,2,2)+p_(1,2,2,2,2), p_(1,1,2,2,1)+p_(1,2,2,2,1), p_(1,1,2 ... 

o117 : Ideal of R 

i118 : J4 = trim(J + s (t M)) 

o118 = ideal (p_(1,1,2,2,2)+p_(1,2,2,2,2), p_(1,1,2,2,1)+p_(1,2,2,2,1), p_(1,1,1 ... 

o118 : Ideal of R 


We could attempt to produce other ideals, but we just obtain ones we have 
already seen. 

i119 : J4 == J + t (s (t M)) 

o119 = true 


The following ideal quotients remove these components, leaving Jrest5. 

i120 : Jrest2 = Jrest : M; 

o120 : Ideal of R 

i121 : Jrest3 = Jrest2 : (s M); 

o121 : Ideal of R 

i122 : Jrest4 = Jrest3 : (t M); 

o122 : Ideal of R 

i123 : Jrest5 = Jrest4 : (s (t M)); 

o123 : Ideal of R 

i124 : Jrest == intersect(J1,J2,J3,J4,Jrest5) 

o124 = true 


The intersection is still correct. If there were embedded components, this 
approach would still find the associated primes, but would not produce the 
primary decomposition. 


So, all we need to do is decompose Jrest5. It is easy to identify the ideal 
after viewing it. 

i125 : (codim Jrest5,degree Jrest5) 

o125 = (20, 64) 

o125 : Sequence 

i126 : P = ideal(R_0 .. R_15) 

o126 = ideal (p_(1,1,1,1,1), p_(1,1,1,1,2), p_(1,1,1,2,1), p_(1,1,1,2,2), p_(1,1,2,1,1), ... 

o126 : Ideal of R 

i127 : Jrest5 == Jrest + P^3 

o127 = true 

i128 : transpose gens trim(J+L+P) 


o128 = {-1} | p_(1,2,2,2,2) | 
{-1} | p_(1,2,2,2,1) | 
{-1} | p_(1,2,2,1,2) | 
{-1} | p_(1,2,2,1,1) | 
{-1} | p_(1,2,1,2,2) | 
{-1} | p_(1,2,1,2,1) | 
{-1} | p_(1,2,1,1,2) | 
{-1} | p_(1,2,1,1,1) | 
{-1} | p_(1,1,2,2,2) | 
{-1} | p_(1,1,2,2,1) | 
{-1} | p_(1,1,2,1,2) | 
{-1} | p_(1,1,2,1,1) | 
{-1} | p_(1,1,1,2,2) | 
{-1} | p_(1,1,1,2,1) | 
{-1} | p_(1,1,1,1,2) | 
{-1} | p_(1,1,1,1,1) | 
{-2} | p_(2,1,2,2,2)p_(2,2,2,2,1)-p_(2,1,2,2,1)p_(2,2,2,2,2) | 
{-2} | p_(2,1,2,1,2)p_(2,2,2,1,1)-p_(2,1,2,1,1)p_(2,2,2,1,2) | 
{-2} | p_(2,1,1,2,2)p_(2,2,1,2,1)-p_(2,1,1,2,1)p_(2,2,1,2,2) | 
{-2} | p_(2,1,1,1,2)p_(2,2,1,1,1)-p_(2,1,1,1,1)p_(2,2,1,1,2) | 

               20       1 
o128 : Matrix R   <--- R 


In order to show that Jrest5 is a primary component, it suffices to show that 
it is equidimensional, since its radical is a prime ideal. 


i129 : u = independentSet Jrest5; 

i130 : (inJrest5,h) = flattener(Jrest5,u); 
i131 : Jrest5 == saturation0(Jrest5,h) 

o131 = true 


Therefore, Jrest5 is a primary ideal. 

So finally we have the following primary decomposition of the original ideal 
J. 

i132 : J == intersect(J0, J+M, J+(s M), J+(t M), J+(s (t M)), J+L+P^3) 

o132 = true 

The ideal is equidimensional, something that was not a priori obvious. It is 
also not a radical ideal. 
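
The degrees computed along the way give a quick consistency check on this decomposition. The following is our own arithmetic, not part of the original session: it assumes that J2, J3 and J4 have the same codimension and degree as J1 (they are images of J1 under the variable permutations s and t), and it uses the fact that all six components have codimension 20, so that their degrees add up.

```latex
\[
  \deg J \;=\; \deg J0 \;+\; 4 \cdot \deg J1 \;+\; \deg Jrest5
         \;=\; 1496 + 4 \cdot 170 + 64 \;=\; 2240 ,
\]
% which agrees with the value (codim J, degree J) = (20, 2240) found above.
```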


Exercise 5.4.5. This is a more difficult exercise! Let G be the graph with 
directed edges $5 \to 3$, $4 \to 3$, $5 \to 1$, $4 \to 1$, $3 \to 2$, 
$2 \to 1$. Let J be the global Markov ideal of G. Show that J has 23 minimal 
prime ideals, and 17 embedded prime ideals. 


5.4.5 Other algorithms, and what to read next 


Good papers to read next include [GTZ88], [SY96], and [DGP99]. 

The algorithm of Shimoyama and Yokoyama [SY96] (the SY algorithm) 
uses the following observation. Suppose that $P_1, \ldots, P_r$ are the minimal 
primes of J, and that $s_1, \ldots, s_r$ are separators, i.e. 
$s_i \in \bigcap_{j \neq i} P_j$, but $s_i \notin P_i$. If 
$(J : s_i^{\ell_i}) = (J : s_i^{\infty})$, then 


$$J = (J : s_1^{\infty}) \cap \cdots \cap (J : s_r^{\infty}) \cap (J + (s_1^{\ell_1}, \ldots, s_r^{\ell_r})).$$


The SY algorithm uses this equality to split an ideal. Each 
$J_i := (J : s_i^{\infty})$ has radical $P_i$, but possibly has embedded primes 
too (these are the so-called pseudo-primary ideals). The algorithm proceeds 
recursively by using flatteners to split these pseudo-primary ideals. This 
algorithm requires that the minimal primes be computed first. One method is to 
use characteristic or triangular sets. See [DGP99] for a description. Another 
method is to keep splitting $\sqrt{J}$, either using ideal quotients or a 
factorizing Gröbner basis algorithm. 

All of these algorithms use similar methods, with some novelties. With 
more commutative algebra background, the paper by Eisenbud, Huneke and 
Vasconcelos [EHV92] has very interesting techniques for computing radicals, 
identifying associated primes, and computing primary decompositions (and 
more!). 


5.4.6 Some open problems 


We close with a few open problems. The first challenge is to find better primary 
decomposition algorithms. 

As for Markov ideals, there are many open problems, see [GSS] for the 
ones presented here. You will find other open problems there as well. 


Problem 5.4.6. Find an efficient method for computing the associated primes 
of an ideal, without first computing the entire primary decomposition. 

There is a very nice solution to this problem, in [EHV92], but unfortu- 
nately, there are many practical situations where their algorithm uses too 
much time or memory to be competitive. 


Problem 5.4.7. What is the condition on Bayesian networks G, $(d_1, \ldots, d_n)$ 
for the ideal $I_{\mathrm{global}(G)}$ to be prime? radical? without embedded 
components? Can one characterize the primary decomposition in terms of G and 
the $d_i$? 


Problem 5.4.8. Prove that every associated prime of $I_{\mathrm{local}(G)}$ or 
$I_{\mathrm{global}(G)}$ is rational. 


Problem 5.4.9. Find the best way of putting the techniques for primary 
decomposition together to handle certain classes of ideals, e.g. 
(1) local and global Markov ideals. 


(2) binomial ideals 


Problem 5.4.10. Prove that the degree 2 part of the ideal $\ker(\Phi)$ (from 
lecture 2) is exactly the same as the degree 2 part of $I_{\mathrm{global}(G)}$. 

This is true for binary random variables, with n = 5, and any random 
variables, if n < 4, see [GSS]. 


Summary. This chapter is intended as a brief survey of the different notions and 
results that arise when we try to compute the algebraic complexity of algorithms 
solving polynomial equation systems. Although it is essentially self-contained, many 
of the definitions, problems and results we deal with also appear in many other 
chapters of this book. We start by considering algorithms which use the dense repre- 
sentation of multivariate polynomials. Some results about the algebraic complexities 
of the effective Nullstellensatz, of quantifier elimination processes over algebraically 
closed fields and of the decomposition of algebraic varieties when considering this 
model are stated. Then, it is shown that these complexities are essentially opti- 
mal in the dense representation model. This is the reason why a change in the 
encoding of polynomials is needed to get better upper bounds for the complexities 
of new algorithms solving the already mentioned tasks. The straight-line program 
representation for multivariate polynomials is defined and briefly discussed. Some 
complexity results for algorithms in the straight-line program representation model 
are mentioned (an effective Nullstellensatz and quantifier elimination procedures, 
for instance). A description of the Newton-Hensel method to approximate roots of 
a system of parametric polynomial equations is made. Finally, we mention some 
new trends to avoid large complexities when trying to solve polynomial equation 
systems. 


6.0 Introduction and basic notation 


The fundamental problem we are going to deal with, as in most other chapters 
of this book, is to solve (over the field of complex numbers C) a system of 
multivariate polynomial equations with coefficients in the field of rational 
numbers Q algorithmically, but our particular point of view is related to the 
question of whether we can predict how long our algorithms will take. Of 
course, we should define what it means to solve such a system. A first possible 
answer would be to decide whether there are any solutions to the given system, 
and, in case there are solutions, to describe them in a ‘useful’ or at least in 
an ‘easy’ way. 

Many attempts to do this are based on trying to transform our problem 
into a linear algebra one. The reason for this is that we know how to solve 
many linear algebra problems effectively. 

The focus of our attention will be the algorithmic solutions to these prob- 
lems; so, we are going to define what an algorithm is for us (perhaps a rather 
inflexible definition but necessary to meet the requirements of our work). 
Roughly speaking, the less time an algorithm takes to perform a task, the 
better. This will lead to the definition of algebraic complexity, a kind of mea- 
sure for the time an algorithm takes to perform what we want it to. 

One of the problems we have when we deal with multivariate polynomials 
is that the known effective ways to factorize them take a lot of time, so we 
will try not to use this tool within our algorithms. 

In the different sections of this chapter, we are going to state the problems 
that will be taken into account and describe (or just mention, if the description 
is beyond the scope of this survey) some ways of solving them. 

Before we begin considering the problems, we need to fix some notation 
and give some definitions: 

A system of polynomial equations is a system 


$$\begin{array}{c} f_1(x_1, \ldots, x_n) = 0 \\ \vdots \\ f_s(x_1, \ldots, x_n) = 0 \end{array}$$


where $f_1, \ldots, f_s$ are polynomials in $\mathbb{C}[X_1, \ldots, X_n]$ and the 
solutions considered will be vectors $(x_1, \ldots, x_n) \in \mathbb{C}^n$. 
Whenever we want to speak about a group of variables or a vector, we often use 
just a capital or lower case letter with no index; for example in this case, we 
could have written $\mathbb{C}[X]$ or $x \in \mathbb{C}^n$. The set 
$V \subset \mathbb{C}^n$ of all the solutions of such a system will be called an 
algebraic variety (or simply a variety if the context is clear). Its dimension 
is one less than the minimum number of generic hyperplanes whose common 
intersection with V is empty. For example, a point has dimension zero (a 
generic hyperplane does not cut it); a line has dimension one (a generic 
hyperplane cuts it, but two generic hyperplanes do not), etc. For a more 
precise definition of dimension see, for example, [Sha77] or [CLO97]. 

From the algorithmic point of view, we deal exclusively with polynomials 
with coefficients in $\mathbb{Q}$ but we still consider all the solutions to 
our systems in $\mathbb{C}^n$. 

Sometimes it will be useful to take into account fields other than 
$\mathbb{Q}$ and $\mathbb{C}$. If k is a field, $\bar{k}$ will denote an 
algebraic closure of k. 


6.1 Statement of the problems 


In this section we are going to state some of the questions we usually want 
to answer when dealing with systems of polynomial equations. Some of these 
problems are also mentioned or studied in other chapters of this book, but we 
present them here to keep this chapter self-contained. 


6.1.1 Effective Hilbert’s Nullstellensatz 


Let $X = \{X_1, \ldots, X_n\}$ be indeterminates over $\mathbb{Q}$. Given s 
polynomials $f_1, \ldots, f_s \in \mathbb{Q}[X]$, if we want to solve the system 
of polynomial equations 


$$\begin{array}{c} f_1(x_1, \ldots, x_n) = 0 \\ \vdots \\ f_s(x_1, \ldots, x_n) = 0 \end{array}$$


the very first question we would like to answer is whether there exists any 
point $(x_1, \ldots, x_n) \in \mathbb{C}^n$ satisfying this system (that is to 
say, if the equations $f_1 = 0, \ldots, f_s = 0$ share a common solution in 
$\mathbb{C}^n$). 

When all the polynomials $f_1, \ldots, f_s$ have degrees equal to 1, the system 
we are dealing with is a linear system and there is a simple computation of 
ranks of matrices involving the coefficients of the polynomials which answers 
our question: 

Suppose our linear system is given by $A \cdot x^t = B$ (with 
$A \in \mathbb{Q}^{s \times n}$ and $B \in \mathbb{Q}^{s \times 1}$). Then 


$$\exists\, x \in \mathbb{C}^n \ / \ A \cdot x^t = B \iff \operatorname{rank}(A) = \operatorname{rank}(A|B)$$


(where (A|B) denotes the matrix we obtain by adding the column B to the 
matrix A). 
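
The rank criterion is easy to try out. The following Python sketch is our own illustration, not an algorithm from the text: it computes ranks by Gaussian elimination with exact rational arithmetic, and the names rank and is_solvable are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a rational matrix, by Gaussian elimination over Q."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0]) if m else 0):
        # Look for a row at or below position r with a nonzero entry.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_solvable(A, B):
    """True iff A x^t = B has a solution, i.e. rank(A) == rank(A|B)."""
    augmented = [row + [b] for row, b in zip(A, B)]
    return rank(A) == rank(augmented)

# x + y = 1, 2x + 2y = 2 is consistent; x + y = 1, 2x + 2y = 3 is not.
A = [[1, 1], [2, 2]]
print(is_solvable(A, [1, 2]))  # True
print(is_solvable(A, [1, 3]))  # False
```

Only additions, multiplications, divisions and comparisons with zero of rational numbers are used, which is the kind of operation the algorithms of this chapter are allowed to perform.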

The first step towards a generalization of this result when we deal with 
polynomials of any degree (generalization in the sense that it relates the ex- 
istence of solutions to some computations involving the coefficients of the 
polynomials considered) is the following well-known theorem: 


Theorem 6.1.1. (Hilbert’s Nullstellensatz) Let 
$f_1, \ldots, f_s \in \mathbb{Q}[X_1, \ldots, X_n]$. 

Then the following statements are equivalent: 

i) $\{x \in \mathbb{C}^n \ / \ f_1(x) = \cdots = f_s(x) = 0\} = \emptyset$. 

ii) There exist polynomials $g_1, \ldots, g_s \in \mathbb{Q}[X_1, \ldots, X_n]$ 
such that $1 = \sum_{1 \le i \le s} g_i \cdot f_i$. 


(See Chapter 4 for other versions of this theorem.) 

A proof of this theorem can be found in almost any basic textbook on 
algebraic geometry (see for example [Har83], [Kun85] or [CLO97]). This result 
was already known to Kronecker and it essentially shows how a geometric 
problem (Is the variety defined as the common zeroes of a fixed set of 
polynomials empty?) is equivalent to an algebraic one (Is 1 an element of the 
ideal $(f_1, \ldots, f_s)$?). 


We will call an algorithm an effective Hilbert’s Nullstellensatz if, given 
as input the polynomials $f_1, \ldots, f_s$, the algorithm computes polynomials 
$g_1, \ldots, g_s$ (in case they exist) such that 
$\sum_{1 \le i \le s} g_i \cdot f_i = 1$. 
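
To fix ideas, here is a small Python sketch of ours that checks a proposed certificate $g_1, \ldots, g_s$ for given $f_1, \ldots, f_s$. It does not compute the $g_i$, which is the hard part; the dictionary representation of polynomials and the names poly_add, poly_mul and is_certificate are assumptions made only for this illustration.

```python
from fractions import Fraction

def poly_add(p, q):
    """Sum of two polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    """Product of two polynomials in the same representation."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def is_certificate(fs, gs):
    """Check whether sum(g_i * f_i) equals the constant polynomial 1."""
    total = {}
    for f, g in zip(fs, gs):
        total = poly_add(total, poly_mul(g, f))
    n = len(next(iter(fs[0])))  # number of variables
    return total == {(0,) * n: 1}

# f1 = X*Y - 1 and f2 = X have no common zero in C^2.
f1 = {(1, 1): Fraction(1), (0, 0): Fraction(-1)}
f2 = {(1, 0): Fraction(1)}
# Candidate certificate: g1 = -1, g2 = Y, since -(XY - 1) + Y*X = 1.
g1 = {(0, 0): Fraction(-1)}
g2 = {(0, 1): Fraction(1)}
print(is_certificate([f1, f2], [g1, g2]))  # True
```

Checking a certificate only needs polynomial additions and multiplications; how large the $g_i$ can be is what determines the cost of finding one.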


Later on, we will mention some effective Hilbert’s Nullstellensätze. 


6.1.2 Effective equidimensional decomposition 


Supposing we already know that a particular system of polynomial equations 
has solutions, we may need to answer some questions about the geometry of 
the algebraic variety they define in C”: Does it consist only of finitely many 
points? Is there a whole curve of solutions? Are there isolated solutions?, etc. 

All these questions can be answered by means of geometric decomposi- 
tions of the algebraic variety defined by the original system of polynomials. 
These decompositions we are going to define are intimately bound up with 
the primary decomposition of ideals considered in Chapter 2 and Chapter 5 
but they do not coincide because our approach is exclusively geometric while 
these others are purely algebraic. 


Definition 6.1.2. An algebraic variety $C \subset \mathbb{C}^n$ is called 
irreducible if it satisfies 


$$C = C_1 \cup C_2 \text{ where } C_1 \text{ and } C_2 \text{ are algebraic varieties} \implies C = C_1 \text{ or } C = C_2.$$


The following is a classical result from algebraic geometry. It states that 
the affine space $\mathbb{C}^n$ is a Noetherian topological space when 
considering the Zariski topology (that is, the topology in which the algebraic 
varieties are the closed sets) and its proof can be found, for example, in 
[Sha77] or [CLO97]. 


Proposition 6.1.3. (Irreducible decomposition) Let $V \subset \mathbb{C}^n$ be 
an algebraic variety. Then, there exist unique irreducible varieties 
$C_1, \ldots, C_r$ such that 


$$C_i \not\subset C_j \text{ if } i \neq j \quad \text{and} \quad V = \bigcup_{1 \le i \le r} C_i.$$


From our point of view and our definitions, the irreducible decomposition is 
not algorithmically achievable. If it were, just by considering the case 
n = 1, we would be able to find all the roots of any univariate rational 
polynomial (note that the irreducible decomposition of 
$\{x \in \mathbb{C} \ / \ \prod_{1 \le i \le d} (x - \alpha_i) = 0\}$ is exactly 
$\bigcup_{1 \le i \le d} \{\alpha_i\}$). This is the reason why we are going to 
consider a less refined decomposition of a variety. 

Let $V = \bigcup_{1 \le i \le r} C_i$ be the irreducible decomposition of the 
variety V and, for every $0 \le j \le n$, consider the union of all the 
irreducible components of V of dimension j: 


$$V_j := \bigcup_{\{i \ / \ 1 \le i \le r \text{ and } \dim C_i = j\}} C_i.$$


It is obvious that $V = \bigcup_{0 \le j \le n} V_j$ where, for every 
$0 \le j \le n$, either $V_j = \emptyset$ or $\dim V_j = j$. This unique 
decomposition is called the irredundant equidimensional decomposition (or 
equidimensional decomposition for short) of V. 

Note that the information given by this decomposition still allows us to 
answer all the questions we asked above. For example, a non-empty variety 
V consists only of finitely many points if and only if $V_0 \neq \emptyset$ and 
$V_j = \emptyset$ for every $1 \le j \le n$. 

The equidimensional decomposition has the following property, nice from 
the algorithmic point of view: 


Proposition 6.1.4. Let $f_1, \ldots, f_s \in \mathbb{Q}[X_1, \ldots, X_n]$ be 
polynomials and let $V \subset \mathbb{C}^n$ be the algebraic variety of their 
common zeroes. If $V_j$ is one of the components appearing in the irredundant 
equidimensional decomposition of V, then there exist polynomials in 
$\mathbb{Q}[X_1, \ldots, X_n]$ defining $V_j$. 


The core of this result is that there are rational polynomials defining 
$V_j$, so we have a chance to compute the irredundant equidimensional 
decomposition algorithmically using only rational coefficients. 

We will call an algorithm an effective equidimensional decomposition 
algorithm if, given an algebraic variety $V \subset \mathbb{C}^n$ defined by 
rational polynomials, the algorithm describes the varieties involved in its 
equidimensional decomposition (i.e. its equidimensional components) as separate 
varieties. 

A final comment has to be made about the irreducible decomposition of 
a variety defined by rational polynomials: we could take into account only 
varieties defined by rational polynomials as closed sets to define the rational 
Zariski topology in $\mathbb{C}^n$. If this is the case, the irreducible 
components of a variety will still be definable by rational polynomials. For 
example, in the case of the variety defined by a squarefree polynomial, its 
rational decomposition will essentially coincide with the factorization of the 
considered polynomial, but as we have stated before, we do not want to deal 
with polynomial factorization, and this is why we are not going to consider 
this problem. (For an algorithm yielding this irreducible decomposition 
numerically, see Chapter 8.) 


6.1.3 Effective quantifier elimination 


Many interesting geometric and algebraic problems can be formulated as first 
order statements over algebraically closed fields and a well-known result from 
logic states that any first order formula in the language of algebraically closed 
fields is equivalent to another formula without quantifiers (see [CK90] for 
details). This is the reason why, in the last decades, special efforts have been 
made to find efficient algorithms to eliminate quantifiers. 

For the sake of simplicity, we will state precisely what elimination of quan- 
tifiers means only in a very particular case: 


Theorem 6.1.5. Let $X_1, \ldots, X_n, Y_1, \ldots, Y_m$ be indeterminates over 
$\mathbb{Q}$ and let 
$f_1, \ldots, f_s, g_1, \ldots, g_\ell \in \mathbb{Q}[X_1, \ldots, X_n, Y_1, \ldots, Y_m]$ 
be polynomials. Let 


$$V := \{x \in \mathbb{C}^n \ / \ \exists\, y \in \mathbb{C}^m : f_1(x,y) = 0 \wedge \cdots \wedge f_s(x,y) = 0 \wedge g_1(x,y) \neq 0 \wedge \cdots \wedge g_\ell(x,y) \neq 0\}.$$


Then, there exists a quantifier free formula $\varphi$ involving only 
polynomials in $\mathbb{Q}[X_1, \ldots, X_n]$, equalities, inequalities and the 
symbols $\wedge$ and $\vee$ such that 


$$V = \{x \in \mathbb{C}^n \ / \ \varphi(x)\}.$$

Let us give some simple examples to make this statement clearer. 


Example 6.1.6. Suppose we want to describe the set of all the polynomials of 
degree bounded by d in one variable that have at least one root in 
$\mathbb{C}$. This set is 


$$V = \{(x_0, x_1, \ldots, x_d) \in \mathbb{C}^{d+1} \ / \ \exists\, y \in \mathbb{C} : x_d y^d + x_{d-1} y^{d-1} + \cdots + x_0 = 0\}.$$


By the Fundamental Theorem of Algebra, a quantifier-free way of defining V is 


$$V = \{(x_0, x_1, \ldots, x_d) \in \mathbb{C}^{d+1} \ / \ x_0 = 0 \vee x_1 \neq 0 \vee x_2 \neq 0 \vee \cdots \vee x_d \neq 0\}.$$


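Example 6.1.7. A very well-known example of a quantifier elimination 
procedure from linear algebra is the use of the determinant. The set 


$$V = \Bigl\{ (x_{ij}) \in \mathbb{C}^{n \times n} \ / \ \exists\, (y_1, \ldots, y_n), (y'_1, \ldots, y'_n) \in \mathbb{C}^n : (y_1, \ldots, y_n) \neq (y'_1, \ldots, y'_n) \ \wedge \ \begin{array}{c} x_{11} y_1 + \cdots + x_{1n} y_n = x_{11} y'_1 + \cdots + x_{1n} y'_n \\ \vdots \\ x_{n1} y_1 + \cdots + x_{nn} y_n = x_{n1} y'_1 + \cdots + x_{nn} y'_n \end{array} \Bigr\}$$


is exactly the subset of $\mathbb{C}^{n \times n}$ defined by the determinant: 

$$V = \{(x_{ij}) \in \mathbb{C}^{n \times n} \ / \ \det(x_{ij}) = 0\}.$$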


Example 6.1.8. The classical resultant with respect to a single variable Y 
between two polynomials $f_1, f_2 \in \mathbb{Q}[X_1, \ldots, X_n][Y]$ monic in Y 
and of degree r and s respectively is another example of eliminating 
quantifiers (for the definition and basic properties of the classical 
resultant between two polynomials, some of which will be used later, see, for 
example, [CLO97], [Mig92], [vdW49] or [Wal62]): 


$$\{x \in \mathbb{C}^n \ / \ \exists\, y \in \mathbb{C} : f_1(x,y) = 0 \wedge f_2(x,y) = 0\} = \{x \in \mathbb{C}^n \ / \ \operatorname{Res}_Y(f_1(x,Y), f_2(x,Y)) = 0\}.$$


For a more general definition of resultants as eliminating polynomials see 
Chapter 2 and Chapter 1. 
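
As a small illustration of this example, the following Python sketch of ours evaluates the resultant as the determinant of the Sylvester matrix, for the specific pair $f_1 = Y^2 - X$ and $f_2 = Y - 1$ (our choice, with n = 1). The function names sylvester, det and resultant_at are hypothetical, and the determinant is computed by plain Gaussian elimination over the rationals.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of f and g, given as coefficient lists in Y,
    highest degree first (f has degree r, g has degree s)."""
    r, s = len(f) - 1, len(g) - 1
    size = r + s
    rows = [[0] * i + f + [0] * (size - r - 1 - i) for i in range(s)]
    rows += [[0] * i + g + [0] * (size - s - 1 - i) for i in range(r)]
    return rows

def det(m):
    """Determinant over Q by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in m]
    d = Fraction(1)
    for c in range(len(m)):
        pivot = next((i for i in range(c, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d  # a row swap changes the sign of the determinant
        d *= m[c][c]
        for i in range(c + 1, len(m)):
            factor = m[i][c] / m[c][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[c])]
    return d

def resultant_at(x):
    """Res_Y(f1(x, Y), f2(x, Y)) for f1 = Y^2 - x and f2 = Y - 1."""
    return det(sylvester([1, 0, -x], [1, -1]))

print([resultant_at(x) for x in (0, 1, 2)])
```

For this pair the resultant equals 1 - x, so it vanishes exactly at x = 1, the only value of x for which $Y^2 - x$ and $Y - 1$ have a common root (namely y = 1).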


As before, we will say that we have an efficient quantifier elimination 
procedure if we have an algorithm that, from a formula of the type 


$$\exists\, y \in \mathbb{C}^m : f_1(x_1, \ldots, x_n, y) = 0 \wedge \cdots \wedge f_s(x_1, \ldots, x_n, y) = 0 \wedge g_1(x_1, \ldots, x_n, y) \neq 0 \wedge \cdots \wedge g_\ell(x_1, \ldots, x_n, y) \neq 0,$$

produces a quantifier-free formula $\varphi$ defining the same subset of 
$\mathbb{C}^n$. 


6.2 Algorithms and complexity 


When we speak about efficiency related to polynomial equation solving, we 
mean the existence of algorithms performing different tasks. But what do we 
call an algorithm? 

The idea of algorithm we deal with is the following: given some data in 
a certain way (numbers, formulae, etc), an algorithm will be a sequential list 
of fixed operations or comparisons that ends in some logical or mathematical 
‘object’ we would like to compute. For example, suppose you want an algo- 
rithm to solve the equation ax = b with coefficients in Q and that you can 
deal with rational numbers algorithmically (that is to say, comparisons and 
operations between rational numbers can be performed somehow). A possible 
algorithm to do this would be the one shown in Figure 6.1. 


[Flowchart with three possible outcomes: every x is a solution; there is no 
solution; the only solution is x = b/a.] 


Fig. 6.1. A possible algorithm to solve the equation ax = b 
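
The three outcomes of such an algorithm can be written out as a short Python sketch. The order of the two tests (first on a, then on b) is an assumption of ours, and solve_ax_eq_b is a hypothetical name.

```python
from fractions import Fraction

def solve_ax_eq_b(a, b):
    """Follow the three branches of a decision procedure for a*x = b over Q."""
    a, b = Fraction(a), Fraction(b)
    if a != 0:
        return ("unique", b / a)   # the only solution is x = b/a
    if b == 0:
        return ("all", None)       # every x is a solution
    return ("none", None)          # there is no solution

print(solve_ax_eq_b(2, 3))   # ('unique', Fraction(3, 2))
print(solve_ax_eq_b(0, 0))   # ('all', None)
print(solve_ax_eq_b(0, 5))   # ('none', None)
```

Each call performs at most two comparisons and one division, whatever the input, so the whole procedure corresponds to a small fixed graph.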


Speaking a little more formally, our algorithms are directed acyclic graphs. 
Each node of a graph represents an element of Q, an operation or a comparison 
between two elements of Q. Each ‘incoming’ arrow denotes that the previously 
computed element or condition is needed to perform the following operation. 
Of course, like any graph, our algorithms have only finitely many nodes. A 
further comment has to be made about the graphs being ‘acyclic’. As we want 
to predict how long our algorithms will take to compute some object, we will 
handle a very fixed or restricted family of algorithms: no ‘WHILE’ instruction 
is admitted in our algorithms. We can replace each ‘WHILE’ instruction by a 
‘FOR’, provided we know beforehand how many times we have to repeat the 
procedure involved. So, an instruction of the kind ‘WHILE x > ε DO...’ is not 
acceptable in our algorithms, unless it can be translated into one of the type 
‘FOR i =1 TO n DO...’ and therefore ‘disentangled’ into a known number of 
sequential operations to avoid cycles in our graph. 

The idea of complexity of an algorithm is related to the time it would take 
the algorithm to perform the desired task. The more ‘complicated’ our graph 
is, the longer the time it will take. So, a first measure of complexity to be 
taken into account may be the number of nodes in the graph. This will be the 
notion of complexity we are going to use throughout these notes, also known 
as sequential complexity. 
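
To make the node count concrete, here is a small Python sketch of ours. It evaluates a univariate polynomial by Horner's rule, which is not discussed in the text and is used here only as a simple loop whose number of iterations is known beforehand, and it counts the arithmetic operations performed. The name horner_with_count is hypothetical.

```python
def horner_with_count(coeffs, x):
    """Evaluate a polynomial by Horner's rule and count the operations.

    coeffs lists the coefficients from highest degree to lowest.
    Returns (value, number of additions and multiplications performed).
    """
    value, ops = coeffs[0], 0
    for c in coeffs[1:]:
        value = value * x + c   # one multiplication and one addition
        ops += 2
    return value, ops

# 2x^3 - x + 5 at x = 3: the loop runs a fixed number of times (the degree),
# so it unrolls into a graph with 2 * 3 = 6 operation nodes.
print(horner_with_count([2, 0, -1, 5], 3))  # (56, 6)
```

The count depends only on the degree, not on the values of the coefficients or of x, which is exactly the kind of prediction the sequential complexity is meant to give.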

Needless to say, this measure of complexity is not very accurate. For ex- 
ample, it is much simpler for a machine to perform the sum 1+ 1 than to 
add two huge numbers but our measure of complexity does not take this into 
account. Moreover, it is generally quicker to compute a sum than a product. 
These considerations give rise to a number of different kinds of complexities 
(non-scalar complexity, bit complexity, etc.) which we will not take into 
account. But, of course, if an algorithm has a very high complexity in our 
terms, then it will be useless to try to run it on any computer. 

There are other possible variables to be taken into account when consider- 
ing the feasibility of an algorithm: for instance, the space in memory needed 
to perform it or whether it is well-parallelizable (that is to say, roughly
speaking, whether it can be run fast enough provided we can use a considerable
number of processors simultaneously or, more precisely, whether the
depth of the algorithm is polynomial in the logarithm of its sequential complexity).
However, our approach to the subject is intended to be basic and we are not 
going to consider these aspects in this chapter either. 

To run an algorithm, we need to encode some given data: for the moment, 
we will refer to the number of nodes we need to encode the input data our 
algorithm can deal with as the size of the input. This size generally depends on 
some quantities such as the number of variables and the number and degrees 
of polynomials involved. We will say an algorithm is polynomial when its 
complexity is bounded by a polynomial function in the size of the input. 

We will also use the usual O notation to express orders of complexities: 
given two functions $f : \mathbb{N} \to \mathbb{N}$ and $g : \mathbb{N} \to \mathbb{N}$, we say that $f = O(g)$ if and
only if there exists $k \in \mathbb{N}$ such that $f(a) \le k\,g(a)$ for all $a \in \mathbb{N}$.


6.3 Dense encoding and algorithms 


As we are trying to solve algorithmic problems involving polynomials, we 
need to encode them somehow. The first (and most naive) way of encoding 
a polynomial is to copy the usual way a polynomial is given: as a sum of 
monomials. To do this in a way a computer can understand it, we need to 
know a bound for the degree of the polynomial and the number of variables 
involved in advance. Then, we should order somehow all the monomials of 
degree less than or equal to the known bound for the degree in the number 
of variables involved. Once this is done, we can encode the polynomial as the 
vector of its coefficients in the preset order. 

For example, let $f(X,Y) = X^2 - 2XY + Y^2 + 3$ be a polynomial we want
to encode. As we know deg(f) = 2, we only have to store the coefficients of the
monomials up to this degree. We previously fix an order for all the monomials
up to degree 2 in two variables, for example $(1, X, Y, X^2, XY, Y^2)$ and, using
this order, the polynomial f will be encoded as $(3, 0, 0, 1, -2, 1)$.
This way of encoding polynomials is called the dense encoding.
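
The dense encoding just described can be sketched in a few lines. This is only an illustration under stated assumptions: polynomials are represented as dictionaries from exponent tuples to coefficients, and the monomial order (by total degree, then by decreasing exponent of the first variables) is one possible choice that happens to reproduce the order used in the example.

```python
from itertools import product


def monomials(n, d):
    """Exponent tuples of all monomials in n variables of degree <= d.

    Ordered by total degree, then by decreasing exponents (X before Y, etc.).
    """
    exps = [e for e in product(range(d + 1), repeat=n) if sum(e) <= d]
    return sorted(exps, key=lambda e: (sum(e), [-x for x in e]))


def dense_encode(poly, n, d):
    """Vector of coefficients of poly (a dict exponent-tuple -> coefficient)."""
    return [poly.get(e, 0) for e in monomials(n, d)]


# f(X, Y) = X^2 - 2XY + Y^2 + 3
f = {(2, 0): 1, (1, 1): -2, (0, 2): 1, (0, 0): 3}
encoded = dense_encode(f, n=2, d=2)
```

With this order, `monomials(2, 2)` lists the exponents of 1, X, Y, X^2, XY, Y^2 in that sequence, so `encoded` is the coefficient vector of f in the preset order.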

Let f be a polynomial of degree bounded by d ($d \ge 2$) in n variables and
let us consider how many coefficients it has, that is to say how many numbers
will be needed to encode it (i.e. its size when considered as an input), provided
we are given a previous monomial ordering. According to our definition, we
have to compute how many monomials in n variables of degree bounded by d
there are, and the exact number is $\binom{d+n}{n}$. If we consider that we are working
with a fixed number of variables n but that the degrees can change, taking
$d \ge 2$, we have that

$$\binom{d+n}{n} = \prod_{1 \le i \le n} \frac{d+i}{i} \le 2d^n.$$

Furthermore, asymptotically in d we have that these two quantities are of
the same order because

$$\frac{d^n}{\prod_{1 \le i \le n} \frac{d+i}{i}} \le n!$$

and this is why we say that a polynomial of degree $d \ge 2$ in n variables has
$O(d^n)$ coefficients.
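
As a quick numerical sanity check of the two inequalities above (not a proof), the following sketch compares the exact number of coefficients with the bounds for a range of small values. The function name `dense_size` and the tested ranges are arbitrary choices.

```python
from math import comb, factorial


def dense_size(n, d):
    """Number of monomials in n variables of degree at most d."""
    return comb(d + n, n)


def bounds_hold(n, d):
    """Check C(d+n, n) <= 2 d^n and d^n <= n! * C(d+n, n) for d >= 2."""
    size = dense_size(n, d)
    return size <= 2 * d**n and d**n <= factorial(n) * size


all_ok = all(bounds_hold(n, d) for n in range(1, 7) for d in range(2, 40))
```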


6.3.1 Hilbert’s Nullstellensatz and dense encoding 


As we have seen in Section 6.1.1, an effective Hilbert’s Nullstellensatz is any
algorithm that, given as input the polynomials $f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_n]$,
decides whether there exist polynomials $g_1,\dots,g_s \in \mathbb{Q}[X_1,\dots,X_n]$ such that

$$\sum_{1 \le i \le s} g_i \cdot f_i = 1 \qquad (6.1)$$

and computes a particular solution $(g_1,\dots,g_s)$ to this identity.

The first step may be to find a bound for the possible degrees of some
polynomial solutions $g_1,\dots,g_s$ to Equation (6.1) as a function of s, n and a
bound d for the degrees of the polynomials $f_1,\dots,f_s$. If we are able to do
so, our problem can be easily transformed into a linear algebra problem: we
could write new variables for the coefficients of the polynomials $g_1,\dots,g_s$
up to the degree we found as a bound and Equation (6.1) would turn into
a linear system by identifying the coefficients on the left with those on the
right. That is why some authors consider the following problem an effective
Hilbert’s Nullstellensatz:

Show explicitly a function $\varphi : \mathbb{N}^3 \to \mathbb{N}$ satisfying the following property:

Let $f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_n]$ such that $\deg(f_i) \le d$ ($1 \le i \le s$). If
$1 \in (f_1,\dots,f_s)$, there exist polynomials $g_1,\dots,g_s \in \mathbb{Q}[X_1,\dots,X_n]$ with
$\deg(g_i) \le \varphi(n,s,d)$ ($1 \le i \le s$) such that $\sum_{1 \le i \le s} g_i \cdot f_i = 1$.

In the case the polynomials we obtain by homogenizing $f_1,\dots,f_s$ have no
common zeros at infinity, the Fundamental Theorem of Elimination Theory
(see [Laz77] and Chapter 1, for example) shows that $\varphi(n,s,d) \le n(d-1)+1$,
but in the general case this bound does not work.
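
Once a degree bound is available, the reduction of Equation (6.1) to linear algebra can be sketched as follows. This is a minimal illustration restricted to the univariate case, with exact rational arithmetic and plain Gaussian elimination; the function names and the representation of polynomials as coefficient lists (lowest degree first) are assumptions made for the example, and it is not the algorithm of any of the cited papers.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Return one solution of A x = b over Q (free variables set to 0), or None."""
    rows, cols = len(A), len(A[0])
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        pivot = M[r][c]
        M[r] = [v / pivot for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    # A zero row with non-zero right-hand side means the system is inconsistent.
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in M):
        return None
    x = [Fraction(0)] * cols
    for i, c in enumerate(pivots):
        x[c] = M[i][-1]
    return x


def certificate(fs, D):
    """Look for g_i of degree <= D with sum g_i * f_i = 1 (univariate f_i)."""
    d = max(len(f) for f in fs) - 1
    n_rows = D + d + 1                      # coefficients of X^0 .. X^(D+d)
    A = [[0] * (len(fs) * (D + 1)) for _ in range(n_rows)]
    for i, f in enumerate(fs):
        for j in range(D + 1):              # unknown coefficient of X^j in g_i
            for k, coeff in enumerate(f):
                A[j + k][i * (D + 1) + j] = coeff
    x = solve_linear(A, [1] + [0] * (n_rows - 1))
    if x is None:
        return None
    return [x[i * (D + 1):(i + 1) * (D + 1)] for i in range(len(fs))]


def combine(gs, fs):
    """Coefficient list of sum_i g_i * f_i."""
    out = [Fraction(0)] * (max(len(g) for g in gs) + max(len(f) for f in fs) - 1)
    for g, f in zip(gs, fs):
        for j, a in enumerate(g):
            for k, c in enumerate(f):
                out[j + k] += a * c
    return out
```

For example, with $f_1 = X^2$ and $f_2 = X - 1$ (no common root) and the bound D = 1, the system has a solution, whereas for $f_1 = X$ and $f_2 = X^2$ (common root 0) it is inconsistent for every D.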

Just as an example, we are going to show a very elementary result of this 
kind, where we obtain bounds similar to the ones obtained by G. Hermann 
[Her26], whose proof was corrected in [MW83]. 


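Theorem 6.3.1. Let $f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_n]$ such that $\deg(f_i) \le d$
($1 \le i \le s$). If $1 \in (f_1,\dots,f_s)$, there exist polynomials $g_1,\dots,g_s \in \mathbb{Q}[X_1,\dots,X_n]$
with $\deg(g_i) \le (3d)^{2^{n-1}}$ ($1 \le i \le s$) such that $\sum_{1 \le i \le s} g_i \cdot f_i = 1$.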


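Proof. We shall prove this theorem using induction on n.

For n = 1, let $f_1,\dots,f_s \in \mathbb{Q}[X]$ and suppose $\deg f_1 = d \ge \deg f_i$
($2 \le i \le s$). If $1 = \sum_{1 \le i \le s} h_i \cdot f_i$, applying the division algorithm by $f_1$ in $\mathbb{Q}[X]$, we
have

$$h_i = f_1 q_i + r_i \qquad (2 \le i \le s).$$

Then we obtain, rearranging the sum, that

$$1 = f_1 \Big(h_1 + \sum_{2 \le i \le s} q_i \cdot f_i\Big) + \sum_{2 \le i \le s} r_i \cdot f_i.$$

As $\deg r_i \le d-1$ ($2 \le i \le s$), we have that $\deg\big(f_1 \cdot (h_1 + \sum_{2 \le i \le s} q_i \cdot f_i)\big) \le 2d-1$.
Therefore, calling $g_1 = h_1 + \sum_{2 \le i \le s} q_i \cdot f_i$ and $g_i = r_i$ ($2 \le i \le s$) we get that
$1 = \sum_{1 \le i \le s} g_i \cdot f_i$ and $\deg g_i \le d-1$ ($1 \le i \le s$).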

Suppose now the result is true for n. Let $f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_{n+1}]$ be
such that $\deg(f_i) \le \deg(f_1) = d$ ($2 \le i \le s$).

We want to deal with polynomials which are monic with respect to a variable.
To do so, consider the following change of variables (where $\lambda_2,\dots,\lambda_n$ are
new parameters): $X_1 = Y_1$, $X_2 = Y_2 + \lambda_2 Y_1$, ..., $X_n = Y_n + \lambda_n Y_1$. The polynomials
we obtain when applying this change of variables have maximum degree
in $Y_1$ and their leading coefficients in this variable are the homogeneous parts
of maximum degree of the original polynomials evaluated in $(1, \lambda_2,\dots,\lambda_n)$.
Choosing a suitable (n − 1)-tuple such that these homogeneous parts do not
vanish, we get the desired linear change of variables.

So, without loss of generality, we can suppose every polynomial $f_i$ is monic
in $X_1$. Introduce new variables $U_1,\dots,U_s,V_1,\dots,V_s$ and consider the polynomials

$$F := \sum_{1 \le i \le s} U_i f_i \quad \text{and} \quad G := \sum_{1 \le i \le s} V_i f_i \quad \text{in } \mathbb{Q}[U,V][X_1,\dots,X_{n+1}].$$

The resultant of these polynomials with respect to the variable $X_1$,

$$\mathrm{Res}_{X_1}(F,G) \in \mathbb{Q}[U,V][X_2,\dots,X_{n+1}],$$

is bi-homogeneous in the groups of variables (U,V) of bi-degree (d,d). We are
going to prove that, if we write

$$\mathrm{Res}_{X_1}(F,G) = \sum_{\alpha,\beta} h_{\alpha,\beta}(X_2,\dots,X_{n+1})\, U^\alpha V^\beta,$$

$f_1,\dots,f_s$ have a common root in $\mathbb{C}^{n+1}$ if and only if $(h_{\alpha,\beta})_{|\alpha|=d,|\beta|=d}$ have a
common root in $\mathbb{C}^n$.
If $(x_1,\dots,x_{n+1}) \in \mathbb{C}^{n+1}$ is a common root of $f_1,\dots,f_s$, then

$$\mathrm{Res}_{X_1}(F,G)(x_2,\dots,x_{n+1})(U,V) = 0$$

and therefore $(h_{\alpha,\beta})_{|\alpha|=d,|\beta|=d}$ have a common root in $\mathbb{C}^n$.

On the other hand, if $(x_2,\dots,x_{n+1})$ is a common root of $(h_{\alpha,\beta})_{|\alpha|=d,|\beta|=d}$,
consider the polynomials F and G in $\mathbb{Q}(U_1,\dots,U_s,V_1,\dots,V_s)[X_1,X_2,\dots,X_{n+1}]$.
Then, $F(X_1,x_2,\dots,x_{n+1})$ and $G(X_1,x_2,\dots,x_{n+1})$ share a common root in
$\overline{\mathbb{Q}(U,V)}$. But, as the roots of $F(X_1,x_2,\dots,x_{n+1})$ lie in $\overline{\mathbb{Q}(U)}$ and the roots of
$G(X_1,x_2,\dots,x_{n+1})$ lie in $\overline{\mathbb{Q}(V)}$, the common root must be in $\mathbb{C}$. That is, there
exists $x_1 \in \mathbb{C}$ such that $F(x_1,\dots,x_{n+1}) = 0$ and $G(x_1,\dots,x_{n+1}) = 0$. As the
variables U, V are algebraically independent, we conclude that $(x_1,\dots,x_{n+1})$
is a common root of the polynomials $f_1,\dots,f_s$.

Then we have reduced the number of variables by one. Note that, because
of Hilbert’s Nullstellensatz, we have shown that

$$1 \in (f_1,\dots,f_s) \iff 1 \in (h_{\alpha,\beta})_{|\alpha|=d,|\beta|=d}.$$

$\mathrm{Res}_{X_1}(F,G)$ can be written as a linear combination of F and G. Taking
into account the degrees of the polynomials involved, we can state that there
exist polynomials R and S in $\mathbb{Q}[U,V][X]$ of degree bounded by $2d^2$ in the
variables X such that $\mathrm{Res}_{X_1}(F,G) = RF + SG$. Rewriting this identity into
powers of U and V, we have that

$$h_{\alpha,\beta} = \sum_{1 \le i \le s} p_i^{(\alpha,\beta)} f_i$$

where the polynomials $p_i^{(\alpha,\beta)}$ have degrees bounded by $2d^2$. Using the
inductive hypothesis for the polynomials $(h_{\alpha,\beta})_{|\alpha|=d,|\beta|=d}$, whose degrees are
bounded by $2d^2$, the theorem follows.


Evidently, this kind of bound is not good for algorithmic purposes. There
are much better bounds for the degrees of the polynomials appearing in the
Nullstellensatz but the proofs are beyond the scope of this survey. Brownawell,
in [Bro87], obtained the first single exponential bound
$\varphi(d,n,s) = 3\min\{n,s\}\, n\, d^{\min\{n,s\}}$ in the characteristic zero case. Then, in [Kol88] and
[FG90] the most precise bounds known up to now for any characteristic were
found: $\varphi(d,n,s) = (\max\{3,d\})^n$. In [SS95] a better bound for the particular
case when d = 2, namely $\varphi(2,n,s) = n2^{n+2}$, was shown.

More precise bounds involving other parameters than d, n and s were
obtained in [Som97], [KSS97] and [GHM+98] (see Section 6.6.2).

Let us make a final comment on the complexity of an algorithm that, using
the dense encoding of polynomials, decides whether the variety they define is
empty or not and, if it is empty, gives as output a linear combination of the
input polynomials equal to 1.

If the input polynomials $f_1,\dots,f_s$ have degrees bounded by d and the
bound for the degrees of the polynomials involved in the linear combination
given by the Nullstellensatz is $\varphi(d,n,s)$, then we only need to solve a system of
$O((\varphi(d,n,s)+d)^n)$ linear equations in $O(s\,\varphi(d,n,s)^n)$ variables (or to prove
that this system has no solution). The complexity of doing this, using the
techniques in [Ber84] and [Mul87], is of order $O(s^4 (\varphi(d,n,s)+d)^{4n})$.

Therefore, using the best effective Nullstellensätze known up to now, which
essentially state $\varphi(d,n,s) = d^n$, the complexity of any algorithm using dense
encoding will be at least of order $O(s\,d^{n^2})$ (see Proposition 6.3.4 below).


6.3.2 Quantifier elimination and dense encoding 


Suppose now we are given s + t polynomials in $\mathbb{Q}[X_1,\dots,X_n][Y_1,\dots,Y_m]$ of
degrees bounded by d and we want to give algorithmically a quantifier-free
formula equivalent to

$$\exists\, y \in \mathbb{C}^m : f_1(x,y) = 0 \wedge \dots \wedge f_s(x,y) = 0 \wedge g_1(x,y) \ne 0 \wedge \dots \wedge g_t(x,y) \ne 0. \qquad (6.2)$$

Rabinowicz’s trick allows us to consider only equalities by means of a new
indeterminate Z and therefore, the previous formula is equivalent to

$$\exists\, y \in \mathbb{C}^m\ \exists\, z \in \mathbb{C} : f_1(x,y) = 0 \wedge \dots \wedge f_s(x,y) = 0 \wedge \Big(1 - z \cdot \prod_{1 \le i \le t} g_i(x,y)\Big) = 0.$$

For a fixed $x \in \mathbb{C}^n$, using Hilbert’s Nullstellensatz, this last formula is
equivalent to

$$\nexists\, p_1,\dots,p_{s+1} \in \mathbb{C}[Y_1,\dots,Y_m,Z] \ /\ 1 = \sum_{1 \le i \le s} p_i f_i + p_{s+1}\Big(1 - Z \cdot \prod_{1 \le i \le t} g_i\Big).$$

Any effective Hilbert’s Nullstellensatz providing upper bounds for the degrees
of the polynomials $p_i$ involved allows us to translate this last formula
into a quantifier-free formula in the coefficients of the polynomials $f_i$ and $g_i$
by means of linear algebra. Suppose the linear system involved is $A \cdot X = B$
where $A \in \mathbb{C}^{\ell \times k}$ and $B \in \mathbb{C}^{\ell}$. The non-existence of solutions is equivalent
to the condition $\mathrm{rank}(A) \ne \mathrm{rank}(A|B)$. Using that the rank of a matrix can
be computed by means of the determinants of its minors, this last condition
can be translated into a (very long) formula involving $\wedge$, $\vee$, equalities and
inequalities to zero. This formula works for every $x \in \mathbb{C}^n$ and therefore, this
formula is equivalent to (6.2).

It is evident that the better the effective Nullstellensatz we are using, the
smaller the complexity of this kind of algorithm will be, provided we compute
the rank of the matrices involved in a smart way (for example, using the
algorithm in [Mul87]).

Given a first order prenex formula $\varphi$ (‘prenex’ meaning that there are several
blocks of existential and universal quantifiers placed at the beginning of
the formula) with coefficients over an algebraically closed field, let $|\varphi|$ be its
length, i.e. the number of symbols needed to encode $\varphi$, let n be the number
of indeterminates involved, let D be one plus the sum of the degrees of the
polynomials that appear in $\varphi$ and let r be the number of blocks of quantifiers.
Heintz and Wüthrich (see [Hei83] and [HW75]) exhibited elimination algorithms
for algebraically closed fields of given characteristic with complexity
bounded by $|\varphi|\, D^{n^{O(n)}}$. In fact, in the 1940s, Tarski already knew the existence
of elimination algorithms but he did not describe them explicitly (see
[Tar51]). Later, using the fundamental techniques described in [CG83] and
[Hei83], Chistov and Grigor’ev considered the problem for prenex formulae
and obtained in [CG84] and [Gri87] more precise complexity bounds of order
$|\varphi|\, D^{n^{O(r)}}$. However, these bounds depend on arithmetic properties of the
base field involved because polynomial factorization algorithms are used as 
subalgorithms. None of the algorithms mentioned before are efficiently well- 
parallelizable. Finally, in [FGM90], a well-parallelizable elimination algorithm 
within the same sequential complexity bounds obtained in [CG84] and [Gri87] 
is constructed combining the methods in [Hei83] with some effective versions 
of Hilbert’s Nullstellensatz (see Section 6.3.1). Moreover, the complexity of 
this algorithm does not depend on particular properties of the base field k. 
Later, the same result was obtained in [Ier89]. In the context of quantifier 
elimination, it is also worth mentioning the work of Renegar (see [Ren92]) on 
elimination over real closed fields since the bounds obtained there are very 
sharp and imply the bounds for elimination over complex numbers. 


6.3.3 Equidimensional decomposition and dense encoding 


Different algorithms describing decompositions of an algebraic variety V have 
been given. Chistov and Grigor’ev (see [CG83]) exhibit an algorithm for the 
computation of the irreducible decomposition provided an algorithm that fac- 
torizes multivariate polynomials with coefficients in the base field is given. 
Giusti and Heintz (see [GH91]) present an algorithm for the equidimensional 
decomposition of algebraic varieties which is well-parallelizable. Although we 
do not include the proof of this last result here, we can state their main 
theorem and the complexity obtained: 


Theorem 6.3.2. Let $f_1,\dots,f_s$ be polynomials in $\mathbb{Q}[X_1,\dots,X_n]$ of degree
bounded by d and let V be the variety they define. There exists an algorithm
of complexity $s^5 d^{O(n^2)}$ which computes, for every $0 \le i \le n$, $d^{O(n^2)}$ polynomials
of degree bounded by $d^n$ defining the equidimensional component of V of
dimension i.


A more recent algorithm to decompose an algebraic variety using Bézoutian 
matrices can be found in [EM99a]. However, the decomposition obtained there 
may not be minimal (embedded components may appear) and the algorithm 
is probabilistic (see Section 6.6.1). 


6.3.4 A lower bound 


In this section, we are going to show that the best bounds obtained so far
(and mentioned before) for the effective Hilbert’s Nullstellensatz are of the
best possible order.

To do so, we are going to state a very well-known example by Masser and 
Philippon (see [Bro87]) that gives a very high lower bound for the degrees of 
the polynomials appearing in the Nullstellensatz: 


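Example 6.3.3. Take the following polynomials in $\mathbb{Q}[X_1,\dots,X_n]$:

$$f_1 = X_1^d,\quad f_2 = X_1 - X_2^d,\quad \dots,\quad f_{n-1} = X_{n-2} - X_{n-1}^d,\quad f_n = 1 - X_{n-1} X_n^{d-1}.$$

If $g_1,\dots,g_n \in \mathbb{Q}[X_1,\dots,X_n]$ are polynomials such that $1 = \sum_{1 \le i \le n} g_i f_i$,
consider a new variable T and evaluate the polynomials in the following vector
of elements in $\mathbb{Q}(T)$:

$$\big(T^{(d-1)d^{n-2}},\ \dots,\ T^{(d-1)d},\ T^{d-1},\ 1/T\big).$$

Note that, under such evaluation, all the polynomials $f_i$ vanish for $2 \le i \le n$
and so we have that

$$1 = g_1\big(T^{(d-1)d^{n-2}},\dots,T^{d-1},1/T\big) \cdot T^{(d-1)d^{n-1}}.$$

This identity implies that $\deg_{X_n}(g_1) \ge (d-1)d^{n-1}$ and therefore
$\deg g_1 \ge (d-1)d^{n-1}$.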
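
The bookkeeping in this example only involves exponents of T, so it can be checked mechanically for small n and d. The sketch below assumes the polynomials $f_1 = X_1^d$, $f_i = X_{i-1} - X_i^d$ ($2 \le i \le n-1$) and $f_n = 1 - X_{n-1}X_n^{d-1}$ and the evaluation point whose last two coordinates are $T^{d-1}$ and $1/T$; the function names are hypothetical.

```python
def evaluation_exponents(n, d):
    """Exponents e_1..e_n such that X_i is evaluated at T**e_i (n >= 2)."""
    e = [0] * (n + 1)              # 1-based indexing; e[0] is unused
    e[n] = -1                      # X_n = 1/T
    e[n - 1] = d - 1               # X_{n-1} = T^(d-1)
    for i in range(n - 2, 0, -1):
        e[i] = d * e[i + 1]        # makes X_i - X_{i+1}^d vanish
    return e[1:]


def exponent_of_f1(n, d):
    """Check that f_2..f_n vanish at the point and return the exponent of f_1."""
    e = evaluation_exponents(n, d)
    # f_i = X_{i-1} - X_i^d vanishes iff e_{i-1} == d * e_i   (2 <= i <= n-1)
    assert all(e[i - 2] == d * e[i - 1] for i in range(2, n))
    # f_n = 1 - X_{n-1} X_n^(d-1) vanishes iff the exponents cancel
    assert e[n - 2] + (d - 1) * e[n - 1] == 0
    # f_1 = X_1^d becomes T^(d * e_1)
    return d * e[0]
```

For every n ≥ 2 the returned exponent equals $(d-1)d^{n-1}$, which is the power of 1/T that $g_1$ has to compensate.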


This simple example shows that, with the notation above, a lower bound
for the degrees of the polynomials $g_i$ appearing in the expression
$1 = \sum_{1 \le i \le s} g_i \cdot f_i$ is $d^n - d^{n-1}$, and therefore we have


Proposition 6.3.4. Any general algorithm that, from an input of s polynomials
$f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_n]$ of degrees bounded by d, computes (provided
they exist) polynomials $g_1,\dots,g_s \in \mathbb{Q}[X_1,\dots,X_n]$ such that $1 = \sum_{1 \le i \le s} g_i \cdot f_i$
and encodes them in dense form must have complexity of order at least $O(d^{n^2})$.

Moreover, in [FGM90], it is shown that, from the point of view of overall
complexity, the complexities they attain for the quantifier elimination algorithm
are optimal when using dense encoding. In fact, they prove the following

Theorem 6.3.5. There exists a sequence of first order formulae (containing
quantifiers and two free variables) $\varphi_k$ ($k \in \mathbb{N}$) over an algebraically closed field
with the following properties:

• $|\varphi_k| = O(k)$.

• For each quantifier free formula $\theta$ equivalent to $\varphi_k$ involving the polynomials
$F_1,\dots,F_s$ there exists i, $1 \le i \le s$, such that $\deg F_i \ge 2^{2^{ck}}$, where
c > 0 is a suitable constant.


Note that this theorem states lower bounds for the degrees of the poly- 
nomials appearing in the output formula, and the greater the degrees, the 
greater the number of nodes needed to encode them. 


6.4 Straight-line Program encoding for polynomials 


6.4.1 Basic definitions and examples 


The comments in Section 6.3.4 show us that it is impossible to obtain more 
efficient general algorithms when dealing with dense encoding of polynomials. 
There are at least two ways of avoiding this problem: the first one is to change
the way the polynomials are encoded (that is to say, to try to find a shorter
way of encoding polynomials) while the second one is to design non-general
algorithms which can only solve special problems but within a lower com- 
plexity. We will now discuss the first of these: changing the way we encode 
polynomials. 

One attempt that has been made to change the representation of poly- 
nomials is the so-called ‘sparse’ encoding, which consists in specifying which 
monomials of a given polynomial have non-zero coefficients and which are 
these coefficients. Suppose a polynomial P has only a few monomials with 
respect to its degree. The sparse encoding will consist of a number of vectors 
which specify the (non-zero) coefficient of every monomial appearing in P. For 
example, if $P = 2X^{15}Y^4 + 2X^7Y^3 - 3X^2 + 1$, it can be encoded by a vector
of four three-tuples, one for each of the monomials appearing in P. In each
three-tuple, the first coordinate would stand for the degree of the monomial
in X, the second one for the degree of the monomial in Y and the third one
would be the coefficient of the monomial, that is, P would be encoded in the
following way

P := ((15, 4, 2); (7, 3, 2); (2, 0, −3); (0, 0, 1))

instead of using a vector of $\binom{21}{2} = 210$ coordinates.
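
The comparison between the two encodings of P can be reproduced with a short sketch. The triple layout (degree in X, degree in Y, coefficient) follows the text; the helper names are assumptions made for the example.

```python
from math import comb

# Sparse encoding of P: one (deg_X, deg_Y, coefficient) triple per monomial.
P = [(15, 4, 2), (7, 3, 2), (2, 0, -3), (0, 0, 1)]


def evaluate_sparse(triples, x, y):
    """Evaluate the polynomial given by its sparse encoding at (x, y)."""
    return sum(c * x**a * y**b for a, b, c in triples)


def total_degree(triples):
    """Total degree of the encoded polynomial."""
    return max(a + b for a, b, _ in triples)


def dense_length(triples, n_vars=2):
    """Number of coordinates the dense encoding of the same polynomial needs."""
    return comb(total_degree(triples) + n_vars, n_vars)
```

Here the sparse encoding stores 4 triples (12 numbers), while `dense_length(P)` counts the coordinates of the dense vector for a polynomial of degree 19 in two variables.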


This way of encoding polynomials has proved to be efficient when dealing 
with particular families of polynomials (see, for example, Chapter 7 and Chap- 
ter 3) and there is a lot of theory and many algorithms that use the sparse 
encoding. For a complete background on this theory (including sparse resultants,
Newton polytopes, toric varieties and Bernstein’s theorem, among other
interesting and very useful notions) we suggest the reader refer to [CLO98],
[GKZ94] and [Ful93].

However, it is not clear whether it is worth using this sparse encoding in
a general algorithm: the output polynomials may have too many monomials.
Moreover, the sparse encoding does not behave well under linear changes of
coordinates in the sense that a ‘short’ polynomial in the sparse form can
change into a very ‘long’ one by means of a linear change of variables: note
that

$$(X+Y)^{100} = \sum_{0 \le k \le 100} \binom{100}{k} X^k Y^{100-k}$$

(that is, a single monomial may turn into a polynomial with many monomials
under a linear change of variables).

An alternative way to encode polynomials (the one we are going to study 
here) is based on the following idea: 

Let P be the polynomial $P := (X+Y)^{100} - 1$. Why can we define this
polynomial so easily (that is to say, using a small number of symbols) while it
takes so much space to encode it for a machine (in both the sparse and the
dense encoding)?

The answer perhaps is that we are used to thinking of a polynomial as 
a ‘formal expression’ rather than a function that can be evaluated. But, as 
far as fields of characteristic zero are concerned, polynomial functions and 
polynomials can be considered as the same objects. Therefore, if we define a 
polynomial function by defining its exact value at every point (that is to say, 
by means of describing how to evaluate it), we will be defining a polynomial. 
In the previous example, the polynomial P would be the only polynomial in
$\mathbb{Q}[X,Y]$ such that, to evaluate it at a pair (x, y), you have to compute the
sum of x and y, raise it to the 100-th power and subtract 1 from the result. This way
of encoding a polynomial will be called a straight-line program. Let us put 
these ideas more precisely: 


Definition 6.4.1. Let $X_1,\dots,X_n$ be indeterminates over $\mathbb{Q}$ and let $R \in \mathbb{N}$.
An element $\beta := (Q_1,\dots,Q_R) \in \mathbb{Q}(X_1,\dots,X_n)^R$ is a straight-line program
(slp for short) if each $Q_\rho$ satisfies one of the following two conditions:

• $Q_\rho \in \mathbb{Q} \cup \{X_1,\dots,X_n\}$ or
• $\exists\, \rho_1, \rho_2 < \rho$ and $* \in \{+,-,\cdot,\div\}$ such that $Q_\rho = Q_{\rho_1} * Q_{\rho_2}$.

We say $\beta$ is a division-free slp if $Q_\rho = Q_{\rho_1} \div Q_{\rho_2} \Rightarrow Q_{\rho_2} \in \mathbb{Q} - \{0\}$.

From now on, we are only going to deal with division-free slp’s. Note
that, in this case, each element $Q_\rho$ is a polynomial in $\mathbb{Q}[X_1,\dots,X_n]$. If
$F \in \{Q_\rho \ /\ 1 \le \rho \le R\}$, we say that $\beta$ computes or calculates F.
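
A straight-line program in the sense of this definition can be modelled as a list of instructions, each one either a constant, a variable, or an operation on two earlier entries. The sketch below is one possible representation (the tuple format and the function names are assumptions for illustration, and only the division-free operations are supported); it builds an slp for $(X+Y)^{100} - 1$ by repeated squaring and evaluates it.

```python
from fractions import Fraction
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_slp(slp, point):
    """Evaluate every coordinate Q_rho of a division-free slp at a point."""
    values = []
    for instr in slp:
        if instr[0] == "const":
            values.append(Fraction(instr[1]))
        elif instr[0] == "var":
            values.append(Fraction(point[instr[1]]))
        else:
            op, i, j = instr                 # indices of two earlier coordinates
            values.append(OPS[op](values[i], values[j]))
    return values


def length(slp):
    """Number of coordinates defined as an operation on previous ones."""
    return sum(1 for instr in slp if instr[0] in OPS)


def slp_for_power_minus_one(exponent):
    """Slp computing (X + Y)**exponent - 1 by repeated squaring (exponent >= 1)."""
    slp = [("var", 0), ("var", 1), ("+", 0, 1)]
    square = 2            # index of (X + Y)^(2^k)
    result = None         # index of the partial product, if any
    e = exponent
    while e:
        if e & 1:
            if result is None:
                result = square
            else:
                slp.append(("*", result, square))
                result = len(slp) - 1
        e >>= 1
        if e:
            slp.append(("*", square, square))
            square = len(slp) - 1
    slp.append(("const", 1))
    slp.append(("-", result, len(slp) - 1))
    return slp
```

With this construction the slp for exponent 100 performs 10 operations (one sum, six squarings, two further products and one subtraction), far fewer than the 101 monomials of the expanded polynomial.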
There are several measures of complexity that can be taken into account
when considering slp’s. For example:

• The total length of $\beta$ (denoted by $L(\beta)$) is the quantity of operations
performed during the slp $\beta$ (more precisely, it is the number of coordinates
$Q_\rho$ defined as the result of an operation between two previous coordinates).

• The additive length of $\beta$ ($L_+(\beta)$) is the quantity of sums and subtractions
performed during the slp.

• The non-scalar length of $\beta$ ($L_\bullet(\beta)$) is the number of products between two
non-rational elements performed during the slp.

Given any polynomial $F \in \mathbb{Q}[X_1,\dots,X_n]$ we will define its total length
(also called total complexity) as

$$L(F) := \min\{L(\beta) \ /\ \beta \text{ is an slp computing } F\}.$$

We can define $L_+(F)$ and $L_\bullet(F)$ analogously.


Exercise 6.4.2. Prove that, for any $F \in \mathbb{Q}[X_1,\dots,X_n]$, $L(F) = O((L_\bullet(F))^2)$.


From now on, unless expressly stated otherwise, we will only consider the total
length of an slp or of a polynomial and, for the sake of brevity, we will simply
call it its length.

As an example, we are going to show an slp that calculates the polynomial
$F(X) = 1 + X + X^2 + X^3 + \dots + X^{2^j-1}$ efficiently. Of course, we can compute
every power of X and then add them up, but it would yield an slp of length
$2^{j+1} - 3$. A better slp computing F, based on the binary expansion of any
positive integer up to $2^j - 1$, is the following one:

$$\beta := \Big(1,\ X,\ X^2,\ X^4,\ \dots,\ X^{2^{j-1}},\ 1+X,\ 1+X^2,\ 1+X^4,\ \dots,\ 1+X^{2^{j-1}},$$
$$(1+X)(1+X^2),\ (1+X)(1+X^2)(1+X^4),\ \dots,\ \prod_{0 \le i \le j-1} (1+X^{2^i})\Big)$$

and $L(\beta) = 3j - 2$.
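
The count of operations in this slp can be verified by building it explicitly. The sketch below uses a hypothetical instruction format (constants, the variable X, sums and products of earlier nodes) and follows the three groups of coordinates: the powers $X^{2^i}$, the factors $1 + X^{2^i}$, and their successive products.

```python
def geometric_slp(j):
    """Slp for 1 + X + ... + X^(2^j - 1) as the product of (1 + X^(2^i)), j >= 1."""
    slp = [("const", 1), ("var",)]            # node 0 is 1, node 1 is X
    powers = [1]                              # indices of X^(2^i)
    for _ in range(j - 1):
        slp.append(("*", powers[-1], powers[-1]))   # squaring: j - 1 products
        powers.append(len(slp) - 1)
    factors = []
    for p in powers:
        slp.append(("+", 0, p))               # 1 + X^(2^i): j sums
        factors.append(len(slp) - 1)
    acc = factors[0]
    for f in factors[1:]:
        slp.append(("*", acc, f))             # partial products: j - 1 products
        acc = len(slp) - 1
    return slp


def run(slp, x):
    """Evaluate the last coordinate of the slp at X = x."""
    vals = []
    for ins in slp:
        if ins[0] == "const":
            vals.append(ins[1])
        elif ins[0] == "var":
            vals.append(x)
        elif ins[0] == "+":
            vals.append(vals[ins[1]] + vals[ins[2]])
        else:
            vals.append(vals[ins[1]] * vals[ins[2]])
    return vals[-1]


def length(slp):
    """Number of operations (sums and products) in the slp."""
    return sum(1 for ins in slp if ins[0] in ("+", "*"))
```

For j = 4 this gives 10 operations, compared with 29 for the naive approach of computing every power of X and adding them up.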


Another well-known example of slp encoding a polynomial is Horner’s rule
for univariate polynomials:

$$a_0 + a_1X + a_2X^2 + \dots + a_dX^d = a_0 + X(a_1 + X(a_2 + X(\dots(a_{d-1} + a_dX)\dots))).$$

The length of this slp is 2d and it involves d products and d sums. It
can be proved that the number of sums and the number of products involved
in any slp computing this polynomial are bounded from below by d when the elements
$X, a_0,\dots,a_d$ are algebraically independent (see, for example, [BCS97]).
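
Horner’s rule translates directly into code. The following sketch evaluates the polynomial from the innermost parenthesis outwards and counts the operations it performs; the function name and the returned triple are choices made for this illustration.

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_d x^d with coeffs = [a_0, ..., a_d].

    Returns (value, number_of_products, number_of_sums).
    """
    acc = coeffs[-1]                       # start with a_d
    products = sums = 0
    for a in reversed(coeffs[:-1]):        # a_{d-1}, ..., a_0
        acc = a + x * acc
        products += 1
        sums += 1
    return acc, products, sums
```

For a polynomial of degree d the loop runs d times, which accounts for the d products and d sums mentioned above.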


Exercise 6.4.3. Let $P \in \mathbb{Q}[X,Y]$ be a polynomial whose sparse encoding is
$P = ((m_1,n_1,c_1),\dots,(m_s,n_s,c_s))$. Find a bound for L(P).


Exercise 6.4.4. Find an infinite family of polynomials in Q[X, Y] such that 
the number of nodes needed to encode each of them into the sparse form is 
(much) greater than its length. 


Exercise 6.4.5. Given a generic polynomial of degree d in Q[X, Y], find an 
upper bound for its length.


Exercise 6.4.6. Try to generalize the three previous exercises to the case of
n-variate polynomials.


A last comment has to be made about the complexity of algorithms when 
dealing with straight-line programs. An slp can be obviously considered as a 
directed acyclic graph without branchings and, therefore, it has nodes. The 
complexity of an algorithm using the slp encoding will be the total number of 
nodes, that is to say, the ones arising as operations or comparisons plus the 
internal nodes of the slp’s involved. 


6.4.2 Some apparent disadvantages 


When we are dealing with slp’s to encode polynomials, we face a fundamental 
problem: the same polynomial may be encoded by means of many different 
slp’s. So, it is not straightforward to verify a polynomial identity. 

Suppose you are given an slp of length L that evaluates a polynomial F
in n variables of degree bounded by d. If you want to know whether F = 0, a
naive attempt would be to interpolate F, but it would take so many points to
do so that the complexity of doing this would again be too large (within the
same order as the number of nodes needed in the dense representation of F).

Another way to solve the problem is to find a smaller particular set of 
points such that two polynomials of bounded length and degree coincide if 
and only if they coincide when evaluated in all these points. Luckily, there is 
a result due to Heintz and Schnorr stating the existence of this set: 


Theorem 6.4.7. (see [HS82]) Let $W(d,n,L) \subset \mathbb{Q}[X_1,\dots,X_n]$ be the set of
polynomials of degree bounded by d that can be calculated by means of an slp
of length L. Let $\Gamma \subset \mathbb{Q}$ be a set of $2L(1+d)^2$ elements. Then, there exists a
set of points $\{\alpha_1,\dots,\alpha_m\} \subset \Gamma^n$ with $m = 6(L+n)(L+n+1)$ satisfying

$$F \in W(d,n,L) \text{ such that } F(\alpha_i) = 0\ \ \forall\, 1 \le i \le m \implies F = 0.$$

The set $\{\alpha_1,\dots,\alpha_m\}$ is called a correct test sequence or a set of questors.
Unfortunately, we do not know how to construct such a set within a reasonable 
cost. A way to avoid this problem is to consider probabilistic algorithms (which 
we will briefly discuss later in Section 6.6.1). 
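
To give a flavour of the probabilistic alternative, here is a minimal sketch of a randomized zero test by evaluation at random points. It is not the correct test sequence of Theorem 6.4.7 and it is not taken from the text: it relies on the standard Schwartz-Zippel bound (a non-zero polynomial of degree at most d vanishes at a random point of $S^n$ with probability at most d/|S|). The function name, the sample set and the number of trials are assumptions.

```python
import random


def probably_zero(evaluate, n_vars, degree_bound, trials=20, rng=None):
    """Randomized test of whether a polynomial given by an evaluation
    procedure is identically zero.

    Points are drawn from S = {1, ..., 2*degree_bound + 1}, so a non-zero
    polynomial passes a single trial with probability below 1/2.
    A False answer is always correct; a True answer may be wrong with
    probability below 2**(-trials).
    """
    rng = rng or random.Random()
    size = 2 * degree_bound + 1
    for _ in range(trials):
        point = [rng.randint(1, size) for _ in range(n_vars)]
        if evaluate(*point) != 0:
            return False
    return True
```

For instance, the procedure `lambda x, y: (x + y)**2 - x*x - 2*x*y - y*y` always passes the test, while `lambda x, y: (x + y)**2 - x*x - y*y` (which equals 2xy) is rejected at the first sampled point, since all sampled coordinates are positive.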

Another question we can ask is how many polynomials can be evaluated
easily (that is to say, can be calculated by means of short slp’s). The answer
again, as we are going to see now, is not very encouraging (see [Sch78] and
[HS80]):

For fixed n, d and L, let us consider the set of all the polynomials
$F \in \mathbb{Q}[X_1,\dots,X_n]$ with $\deg(F) \le d$ and non-scalar length $L_\bullet(F) \le L$.

Observe that each of these polynomials can be computed by a ‘non-scalar’
slp (that is to say, the only coordinates we are taking into account in this slp
are the products between non-scalar elements)

$$\beta = (\beta_{-n+1},\dots,\beta_0,\beta_1,\dots,\beta_L)$$

where $\beta_{-n+i} = X_i$ ($1 \le i \le n$) and, defining $\beta_{-n} := 1$,

$$\beta_k = \Big(\sum_{-n \le j \le k-1} a_j^{(k)} \beta_j\Big) \cdot \Big(\sum_{-n \le j \le k-1} b_j^{(k)} \beta_j\Big).$$

Considering new variables $A_j^{(k)}$ and $B_j^{(k)}$ ($1 \le k \le L$; $-n \le j \le k-1$),
there exist polynomials $Q_\alpha^{(k)} \in \mathbb{Q}[A_j^{(k)}, B_j^{(k)}]$ such that the coefficients
of any polynomial $F_k$ that can be computed in the k-th step of $\beta$ are the
specializations of these polynomials in some rational vectors a and b, that is
to say

$$F_k = \sum_{\alpha} Q_\alpha^{(k)}(a,b)\, X^\alpha.$$


So we have 


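Proposition 6.4.8. For every L, $n \in \mathbb{N}$ there exist polynomials $Q_\alpha \in \mathbb{Z}[T_1,\dots,T_m]$
with $m = (L+n)(L+n+1)$, $\alpha \in (\mathbb{N}_0)^n$, $|\alpha| \le 2^L$, $\deg Q_\alpha \le 2|\alpha|L$
such that for every $F \in \mathbb{Q}[X_1,\dots,X_n]$ satisfying $L_\bullet(F) \le L$,

$$F = \sum_{\alpha} Q_\alpha(t)\, X^\alpha \quad \text{for some } t \in \mathbb{Q}^m.$$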


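Now, we can consider the morphism obtained by evaluating the family of
polynomials $(Q_\alpha : |\alpha| \le d)$:

$$(Q_\alpha : |\alpha| \le d) : \mathbb{C}^{(L+n)(L+n+1)} \to \mathbb{C}^{\binom{d+n}{n}}.$$

Therefore, if $F := \sum_\alpha c_\alpha X^\alpha \in \mathbb{Q}[X_1,\dots,X_n]$ is any polynomial that has
$\deg(F) \le d$ and non-scalar length $L_\bullet(F) \le L$, considering it as the vector
$(c_\alpha) \in \mathbb{Q}^{\binom{d+n}{n}}$, it turns out that $F \in \mathrm{Im}(Q_\alpha : |\alpha| \le d)$.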

As a consequence, we have that, for fixed d, n, $L \in \mathbb{N}$, the set

$$W(n,d,L) := \overline{\mathrm{Im}(Q_\alpha : \alpha \in (\mathbb{N}_0)^n, |\alpha| \le d)} \subset \mathbb{C}^{\binom{d+n}{n}}$$

is a closed set that contains all the vectors of coefficients of polynomials
$F \in \mathbb{Q}[X_1,\dots,X_n]$ such that $\deg F \le d$ and $L_\bullet(F) \le L$.

A very important remark is that, as W(n,d,L) is defined by means of a
polynomial function in (L+n)(L+n+1) variables, its dimension is bounded by
$\dim W(n,d,L) \le (L+n)(L+n+1)$. This can be interpreted in the following
way: the polynomials of degree bounded by d with non-scalar complexity
bounded by L, considered in $\mathbb{C}^{\binom{d+n}{n}}$ (a space of dimension $\binom{d+n}{n}$), lie in a
variety of dimension at most (L+n)(L+n+1). So, as long as L satisfies
$(L+n)(L+n+1) < \binom{d+n}{n}$, there are very few polynomials easy to evaluate since the
complement of the variety they lie in is a non-empty open set in the Zariski
topology. Therefore, most polynomials are difficult to evaluate.
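
To get a feeling for the dimension count above, the following sketch computes, for given n and d, the largest L for which $(L+n)(L+n+1)$ is still smaller than $\binom{d+n}{n}$. The function name is hypothetical and the loop is a plain search, which is enough for small parameters.

```python
from math import comb


def max_easy_length(n, d):
    """Largest L >= 0 with (L+n)(L+n+1) < C(d+n, n), or None if there is none."""
    target = comb(d + n, n)
    L = -1
    while (L + 1 + n) * (L + 2 + n) < target:
        L += 1
    return L if L >= 0 else None
```

For example, for n = 2 and d = 100 the space of coefficient vectors has dimension 5151, and the polynomials of non-scalar length at most 69 lie in a proper closed subset of it.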

Taking these last observations into account, one may wonder if it would 
be useful to deal with slp’s when trying to solve polynomial equations. The 
answer is affirmative as we will see in the following sections. 


6.4.3 A fundamental result 


In [GH93], Giusti and Heintz obtain a fundamental result using for the first 
time straight-line programs to solve algorithmically a problem related to solv- 
ing a system of polynomial equations. In that paper, they give a polynomial 
algorithm that can decide whether a given algebraic variety V is empty or 
not from the polynomials defining V encoded in dense form. In fact, they go a 
little further: given polynomials, encoded in dense form and defining a variety 
V, they can find the dimension of V algorithmically in polynomial time. 

In a first step, they design an algorithm that, given polynomials defining
a variety V, computes a variety Z, either zero-dimensional or empty, satisfying
the following conditions ($V_0$ will denote, as usual, the zero-dimensional
equidimensional component of V):

• $V_0 \subset Z \subset V$ (that is to say, all the isolated points of V are in Z and all
the points of Z are points in the variety).

• The way Z is presented makes it ‘easy’ to decide whether it is empty or
not (a more precise description of this way of presenting Z will be given
in Section 6.4.4).

Note that, if we already know that the variety V is either empty or has
dimension 0, we can decide if it is empty by means of this result
($V = \emptyset \iff Z = \emptyset$).

The general idea of the algorithm computing the dimension of V is the
following: suppose the variety V is defined by $f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_n]$.
Generally, if it is not empty, when we cut it with a hyperplane $H_1$, we will
obtain a variety $V \cap H_1$ of dimension dim V − 1. Continuing this process with
‘generic’ hyperplanes, we have that, after dim V + 1 steps, by reducing the
dimension by one in each step, we get the empty set. Then, as dim V ≤ n,
when we cut it with n + 1 ‘generic’ hyperplanes we obtain the empty set:

$$V \cap H_1 \cap \dots \cap H_{n+1} = \emptyset.$$

So, we have that $V \cap H_1 \cap \dots \cap H_n$ is either the empty set or a variety consisting
only of isolated points and we are under the required hypotheses to decide
whether it is empty or not. If it is not empty, then dim V = n. If it is empty,
we consider the variety $V \cap H_1 \cap \dots \cap H_{n-1}$ and repeat the process. After at
most n + 1 steps we will know the dimension of V (because it is equal to the
minimum number of ‘generic’ hyperplanes we have to cut V with to obtain
the empty set minus one).

The sets of n + 1 hyperplanes that do not satisfy the desired conditions can
be considered as elements of a proper closed set in a suitable affine space $\mathbb{C}^N$,
that is to say the whole construction we have made works for almost every
set of n + 1 hyperplanes. This is what we meant by ‘generic’ hyperplanes in
the last paragraph.

The proof of the result by Giusti and Heintz is beyond the scope of this 
survey, but we are going to take into account some of the ideas used there. 


6.4.4 An old way of describing varieties: the Shape lemma 


In [GH93], Giusti and Heintz use a particular way of defining zero-dimensional 
varieties which was already used by Kronecker (see [Kro82]). This way of pre- 
senting the variety is called a shape lemma presentation or a geometric resolu- 
tion of the variety. (This same description is presented under different names 
in other chapters of this book: single variable representation in Chapter 2, 
univariate representation in Chapter 3 and shape lemma in Chapter 4.) The 
idea of this presentation is quite simple: 

Suppose we are given a zero-dimensional variety $Z \subset \mathbb{C}^n$ defined by
polynomials in $\mathbb{Q}[X_1,\dots,X_n]$ and consisting of $D$ points

$$z^{(1)} = (z_1^{(1)},\dots,z_n^{(1)}),\ \dots,\ z^{(D)} = (z_1^{(D)},\dots,z_n^{(D)}).$$

Suppose also that their first coordinates are all different from one another.
Therefore, we can obtain a polynomial $Q \in \mathbb{Q}[T]$ of degree $D$ whose zeroes
are exactly these first coordinates; namely

$$Q = \prod_{1 \le i \le D} (T - z_1^{(i)}).$$

Moreover, using interpolation, fixing an index $j$ ($2 \le j \le n$), there exists a
unique polynomial $P_j \in \mathbb{Q}[T]$ of degree bounded by $D-1$ such that
$P_j(z_1^{(i)}) = z_j^{(i)}$ for every $1 \le i \le D$. Then,

$$Z = \{x \in \mathbb{C}^n \;/\; Q(x_1) = 0 \wedge x_2 - P_2(x_1) = 0 \wedge \cdots \wedge x_n - P_n(x_1) = 0\}.$$

This parametric description of $Z$ (note that all coordinates are parametrized
as functions of $x_1$) has the additional property of telling us how many
points are in $Z$ (this quantity coincides with the degree of $Q$).
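The construction of $Q$ and the $P_j$ can be sketched directly when the points themselves are known. The code below is a minimal illustration assuming rational points with pairwise distinct first coordinates; polynomials are stored as coefficient lists in increasing degree, and the helper names are hypothetical. It does not address the real difficulty, which is to find these polynomials from equations defining $Z$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Horner evaluation of a coefficient list at x."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc


def shape_lemma(points):
    """Return (Q, [P_2, ..., P_n]) for points with distinct first coordinates."""
    xs = [Fraction(p[0]) for p in points]
    if len(set(xs)) != len(xs):
        raise ValueError("first coordinates must be pairwise different")
    Q = [Fraction(1)]
    for x in xs:
        Q = poly_mul(Q, [-x, Fraction(1)])          # multiply by (T - x)
    P = []
    for j in range(1, len(points[0])):
        Pj = [Fraction(0)] * len(xs)
        for i, xi in enumerate(xs):
            basis, denom = [Fraction(1)], Fraction(1)
            for k, xk in enumerate(xs):
                if k != i:
                    basis = poly_mul(basis, [-xk, Fraction(1)])
                    denom *= xi - xk
            scale = Fraction(points[i][j]) / denom   # Lagrange interpolation weight
            for d, c in enumerate(basis):
                Pj[d] += scale * c
        P.append(Pj)
    return Q, P
```

With the three points (0,1), (1,3), (2,7) this yields $Q = T^3 - 3T^2 + 2T$ and $P_2 = T^2 + T + 1$; evaluating $P_2$ at each root of $Q$ recovers the second coordinates.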

The only inconvenience of this description is that we need the first
coordinates of the points to be different from one another and this is not
always the case. The way to solve this is to consider an affine linear form
$\ell(X) = u_0 + u_1X_1 + \cdots + u_nX_n$ in $\mathbb{Q}[X_1,\dots,X_n]$ such that the values
$\ell(z^{(i)})$ ($1 \le i \le D$) are all different from one another (in this case we say
either that $\ell$ is a primitive element of $Z$ or that $\ell$ separates the points in $Z$).

Now, we are able to define what we call a geometric resolution of a
zero-dimensional variety:


262 J. Sabia 


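Definition 6.4.9. Let $Z = \{z^{(1)},\dots,z^{(D)}\} \subset \mathbb{C}^n$ be a zero-dimensional
variety defined by polynomials in $\mathbb{Q}[X_1,\dots,X_n]$. A geometric resolution of
$Z$ consists of an affine linear form $\ell(X) = u_0 + u_1X_1 + \cdots + u_nX_n$ in
$\mathbb{Q}[X_1,\dots,X_n]$, and polynomials $Q, P_1,\dots,P_n \in \mathbb{Q}[T]$ (where $T$ is a new
variable) such that:

• $\ell(z^{(i)}) \ne \ell(z^{(k)})$ if $i \ne k$.
• $Q(T) = \prod_{1 \le i \le D} (T - \ell(z^{(i)}))$.
• For $1 \le j \le n$, $\deg P_j \le D-1$ and

$$Z = \{(P_1(\xi),\dots,P_n(\xi)) \;/\; \xi \in \mathbb{C} \text{ such that } Q(\xi) = 0\}.$$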


As this description of $Z$ is uniquely determined once $\ell$ is fixed, we call it the
geometric resolution of $Z$ associated to $\ell$.

For the sake of simplicity, we also define the notion of geometric resolution 
for the empty set, and in this case, the polynomial Q is 1. 
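As a small illustration (the point set is chosen here for the example and does not come from the text), take $Z = \{(1,2),(1,3)\} \subset \mathbb{C}^2$. The first coordinates coincide, so $X_1$ does not separate the points, but the linear form $\ell = X_1 + X_2$ does, and the associated geometric resolution can be written down by hand:

```latex
\begin{aligned}
\ell(1,2) &= 3, \qquad \ell(1,3) = 4,\\
Q(T) &= (T-3)(T-4) = T^2 - 7T + 12,\\
P_1(T) &= 1, \qquad P_2(T) = T - 1,\\
Z &= \{(P_1(\xi), P_2(\xi)) \;/\; Q(\xi) = 0\} = \{(1,2),\,(1,3)\}.
\end{aligned}
```

Both $P_1$ and $P_2$ have degree at most $D - 1 = 1$, as the definition requires.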

Although this definition is quite easy to understand, the problem underlying
it is to find (given the zero-dimensional variety $Z \subset \mathbb{C}^n$ defined by
polynomials $f_1,\dots,f_s$) a proper linear form and the polynomials $Q, P_1,\dots,P_n$ (note
that our definition is based on the coordinates of the points in $Z$!).

In [GH93] Giusti and Heintz do not find the exact geometric resolution
of the isolated points of a variety V but they are able to find a linear form
$\ell$ which separates the isolated points of V, a polynomial which vanishes over
the specialization of $\ell$ in the isolated points of V and, by means of them,
they find a geometric resolution of a zero-dimensional variety Z satisfying
$V_0 \subset Z \subset V$. Given the polynomials defining V, in a first step they introduce
a new variable to make a deformation in order to reduce the problem to 
the case of a zero-dimensional projective variety. Then, using some ideas and 
results of [Laz77] about the regularity of the Hilbert function of a suitable 
graded ring and some linear algebra algorithms ([Ber84] and [Mul87]), they 
obtain the characteristic polynomials of several linear maps which allow them 
to get the desired geometric resolution. 

Note that, if the variety V is zero-dimensional, the algorithm of [GH93] 
computes a geometric resolution of V. For an improved and more detailed 
version of this construction of a geometric resolution of a zero-dimensional 
variety from polynomials defining it, see [KP96]. 


6.4.5 Newer algorithms, lower bounds 


The paper we have already mentioned ([GH93]) is a milestone in the devel- 
opment of algorithms solving polynomial equations symbolically. The main 
theorem proved there is: 


Theorem 6.4.10. There exists an algorithm that, given polynomials $f_1,\dots,f_s$
in $\mathbb{Q}[X_1,\dots,X_n]$ of degrees bounded by $d$ in the dense encoding defining an
algebraic variety $V \subset \mathbb{C}^n$, computes $\dim V$ within complexity $s^{O(1)} d^{O(n)}$.

Note that this result allows us to answer the first question concerning a
polynomial equation system (whether its set of solutions is empty or not) just
by computing its dimension.

Some of the problems stated have since been solved within polynomial 
time (that is to say, by means of polynomial algorithms), by using different 
tools. 

In [GHS93], given polynomials $f_1,\dots,f_s \in \mathbb{Q}[X_1,\dots,X_n]$ such that the variety
they define is empty, a family of polynomials $g_1,\dots,g_s \in \mathbb{Q}[X_1,\dots,X_n]$
such that $1 = \sum_{1 \le i \le s} g_i f_i$ holds is constructed. The polynomials $g_1,\dots,g_s$
have degree bounded by $d^{O(n^2)}$ and are obtained in an slp encoding. The
complexity of the whole algorithm is $s^{O(1)} d^{O(n)}$ (compare with the end of
Subsection 6.3.1). In [FGS95] the same problem is re-considered by using duality
theory and a completely different algorithm is designed so that the new
polynomials $g_1,\dots,g_s$ obtained have degree bounded by $d^{O(n)}$.

A quantifier elimination algorithm using slp’s was obtained in [PS98]. The 
main result there is more general than the one we have stated above, but 
adapted to our case it would essentially mean that the elimination stated 
before can be done in polynomial time in the size of the input. 

We can also mention polynomial algorithms for the equidimensional
decomposition of varieties (see [JS02] and [Lec00]). However, these algorithms are
probabilistic (see Section 6.6.1 for a brief account of probabilistic algorithms).


6.5 The Newton-Hensel method 


The use of slp’s as a way of encoding polynomials made it possible to adapt 
algorithmically a very well-known concept, the Newton-Hensel method, which 
can be seen as a particular version of the implicit function theorem. (Compare 
with the Hensel operator defined in Chapter 9.) 

Let $T_1,\dots,T_m,X_1,\dots,X_n$ be indeterminates over $\mathbb{Q}$. Given $t \in \mathbb{C}^m$,
$T - t$ will represent the vector $(T_1 - t_1,\dots,T_m - t_m)$.

Let $f_1,\dots,f_n \in \mathbb{Q}[T,X]$ be polynomials. We will denote by $f$ the vector
of polynomials $(f_1,\dots,f_n)$, by $Df$ the Jacobian matrix of $f$ with respect to
the indeterminates $X$ and by $Jf$ its determinant.


Lemma 6.5.1. Let $f_1,\dots,f_n \in \mathbb{Q}[T,X]$ and let $(t,\xi) \in \mathbb{C}^m \times \mathbb{C}^n$ be such that

$$f_1(t,\xi) = 0,\ \dots,\ f_n(t,\xi) = 0 \quad \text{and} \quad Jf(t,\xi) \ne 0.$$

Then, there exists a unique n-tuple of formal power series $R = (R_1,\dots,R_n) \in
\mathbb{C}[[T-t]]^n$ such that:

• $f_1(T,R) = 0,\ \dots,\ f_n(T,R) = 0$
• $R(t) = (R_1(t),\dots,R_n(t)) = \xi$.

Proof. (This is only a sketch; for a very detailed proof of this fact, see for
example [HKP+00].)

Given $f(X) = (f_1(T,X),\dots,f_n(T,X))$, we define the Newton-Hensel
operator associated to it as

$$N_f(X)^t := X^t - Df(X)^{-1} \cdot f(X)^t.$$

Note that $Jf(X)$ is not the zero polynomial (from our hypothesis, $Jf(t,\xi) \ne 0$)
and, therefore, our definition makes sense.
We define the following sequence of rational functions:

$$R^{(0)} := \xi, \qquad R^{(k)} := N_f(R^{(k-1)}) = N_f^k(\xi) \quad \text{for } k \in \mathbb{N}.$$

The first thing to take into account is whether we can define this sequence
(that is to say, if we do not try to divide by zero) but this fact can be
proved inductively using that $R^{(k)}(t) = \xi$.

The following conditions are fulfilled (this can be proved recursively):

• $f_i(T,R^{(k)}) \in (T-t)^{2^k} \subset \mathbb{C}[[T-t]]$ for every $1 \le i \le n$
• $R_j^{(k+1)} - R_j^{(k)} \in (T-t)^{2^k} \subset \mathbb{C}[[T-t]]$ for every $1 \le j \le n$

where $(T-t)$ indicates the ideal in $\mathbb{C}[[T-t]]$ generated by $T_1-t_1,\dots,T_m-t_m$.
Therefore, the sequences $(R_j^{(k)})_{k \in \mathbb{N}}$ are convergent ($1 \le j \le n$) and the
n-tuple of their limits $R := (R_1,\dots,R_n)$ is the vector we are looking for.
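The doubling of precision in the proof sketch can be observed on the simplest possible instance. The following sketch assumes $m = n = 1$, $f(T,X) = X^2 - (1+T)$, $t = 0$ and $\xi = 1$ (an example chosen here, not taken from the text), so that the limit $R$ is the power series of $\sqrt{1+T}$. Series are truncated coefficient lists with exact rational entries; the helper names are hypothetical.

```python
from fractions import Fraction


def series_mul(a, b, prec):
    """Product of two power series (coefficient lists) modulo T**prec."""
    out = [Fraction(0)] * prec
    for i, ai in enumerate(a[:prec]):
        for j, bj in enumerate(b[:prec - i]):
            out[i + j] += ai * bj
    return out


def series_inv(a, prec):
    """Inverse of a power series with nonzero constant term, modulo T**prec."""
    inv = [Fraction(0)] * prec
    inv[0] = 1 / Fraction(a[0])
    for k in range(1, prec):
        acc = sum(a[i] * inv[k - i] for i in range(1, min(k, len(a) - 1) + 1))
        inv[k] = -acc * inv[0]
    return inv


def newton_hensel_sqrt(steps):
    """Iterate N_f(X) = X - f(X)/f'(X) for f = X**2 - (1 + T), starting at xi = 1."""
    prec = 1
    R = [Fraction(1)]                                    # R^(0) = xi
    for _ in range(steps):
        prec *= 2                                        # precision doubles each step
        R = R + [Fraction(0)] * (prec - len(R))
        f = series_mul(R, R, prec)                       # f(T, R) = R^2 - 1 - T
        f[0] -= 1
        f[1] -= 1
        jac_inv = series_inv([2 * c for c in R], prec)   # Df(R)^(-1) = 1 / (2R)
        corr = series_mul(jac_inv, f, prec)
        R = [r - c for r, c in zip(R, corr)]
    return R
```

Two steps already give $1 + T/2 - T^2/8 + T^3/16$, which is correct modulo $T^4$; each further step doubles the number of correct coefficients.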


Just to show how this works, we are going to discuss an example briefly. 


Example 6.5.2. Given $n$ polynomials of degrees $d_1,\dots,d_n$ in $n$ variables defining
a zero-dimensional variety V and for a generic linear form $\ell$, we show how
to compute, in many cases at least, the polynomial $Q(T)$ of Definition 6.4.9
that leads to a geometric resolution of V:

We consider generic polynomials $f_1,\dots,f_n$ of degrees $d_1,\dots,d_n$ in the
variables $X_1,\dots,X_n$:

$$f_1 = \sum_{|\alpha| \le d_1} T_\alpha^{(1)} X^\alpha, \quad \dots, \quad f_n = \sum_{|\alpha| \le d_n} T_\alpha^{(n)} X^\alpha$$

(note that each coefficient of $f_i$ is a new variable $T_\alpha^{(i)}$).

Consider the variety $W := \{(x_1,\dots,x_n) \in \mathbb{C}^n \;/\; x_1^{d_1} - 1 = 0, \dots, x_n^{d_n} - 1 = 0\}$.
Of course, we know all the points in this set: they are n-tuples of roots
of unity. Let $t$ be the vector of coefficients of the polynomials defining $W$.
Therefore we are under the conditions needed to apply Lemma 6.5.1 because
we have that, for every $\xi \in W$, $(t,\xi)$ is a particular instance of $(T,X)$ that
satisfies the needed hypotheses (it is easy to see that in this instance, $Jf(t,\xi) \ne 0$).
Then, by applying the Newton-Hensel algorithm we can approximate vectors
of power series in $T - t$ which will be roots of the original system and we can
do it as precisely as we want.

We will have then $\prod_{1 \le i \le n} d_i$ different (approximations of) vectors of power
series that should be all the roots of the original system over $\overline{\mathbb{Q}(T)}$ (it can be
seen that the system we are dealing with has dimension zero when we think
of $T$ as a set of parameters and Bézout’s theorem states that the number of
solutions is bounded by $\prod_{1 \le i \le n} d_i$).

Suppose that, from every $\xi \in W$, we obtain the associated solution $R_\xi \in
\mathbb{C}[[T-t]]^n$ of the original system. Then

$$\prod_{\xi \in W} (Y - \ell(R_\xi))$$

is a polynomial in $\mathbb{Q}[[T-t]][Y]$ that vanishes at every point $\ell(R_\xi)$. In fact, this
polynomial is the polynomial of minimal degree defining the image of our
original variety under the morphism

$$\overline{\mathbb{Q}(T)}^{\,n} \to \overline{\mathbb{Q}(T)}, \qquad w \mapsto \ell(w).$$


As our original variety is definable with polynomials in $\mathbb{Q}(T)[X]$, this
polynomial we obtain must be in $\mathbb{Q}(T)[Y]$ and therefore, by multiplying it by a
fixed polynomial $h \in \mathbb{Q}[T]$ we obtain a polynomial $M \in \mathbb{Q}[T][Y]$ satisfying
the following:

$$M(T, \ell(X_1,\dots,X_n)) \in (f_1,\dots,f_n)$$

(here we are using that the ideal the polynomials $f_1,\dots,f_n$ define is radical).

Therefore, given $n$ polynomials in $n$ variables defining a zero-dimensional
variety V, provided the vector of their coefficients $t_0$ does not lie in a
hypersurface, we can obtain by evaluating $M(T,Y)$ in $t_0$ a non-zero polynomial
$M(t_0,Y) \in \mathbb{Q}[Y]$ which specialized in the linear form $\ell$ vanishes over the
zeroes of V. This is a fundamental step we mentioned before (see Sections 6.4.3
and 6.4.4).

Of course a lot of work has to be done to succeed in finding this polynomial. 
For example one should know somehow up to what precision the Newton- 
Hensel algorithm is needed, how to compute the polynomial h, and so on, but 
this is just an example of how things work. 


There are two main features to be taken into account when considering the
Newton-Hensel algorithm. The first one is that an approximation of the power
series vector up to a given precision can be obtained in very few steps (note
that to obtain the series we are looking for up to degree $\delta$ we only have to
apply $\log_2 \delta$ steps of our iteration). The second one is that the Newton-Hensel
method deals essentially with slp’s. In fact, an algorithmic statement of the
Newton-Hensel method is the following lemma (see [GHH+97] for a proof):


Lemma 6.5.3. Under the same hypotheses and notation of Lemma 6.5.1, suppose
the polynomials $f_1,\dots,f_n$ have degree bounded by $d$ and are given by an
slp of length $L$. Let $\kappa \in \mathbb{N}$; then there exists an slp of length $O(\kappa d^2 n^7 L)$
which evaluates polynomials $g_1^{(\kappa)},\dots,g_n^{(\kappa)}, h^{(\kappa)} \in \mathbb{Q}[T][X]$ with $h^{(\kappa)}(t,\xi) \ne 0$
which represent the numerators and the denominator of the rational functions
obtained in the $\kappa$-th iteration of the Newton-Hensel operator.


The Newton-Hensel method has been successfully used to obtain more
efficient algorithms to solve polynomial equation systems. This tool was first
introduced in this framework in [GHM+98], where an algorithm solving
zero-dimensional systems was designed and an effective Nullstellensatz
was stated. However, these procedures required computing with
algebraic numbers. In [GHH+97], the first completely rational algorithm using
the Newton-Hensel method was obtained and the complexity bounds were
improved in [GLS01] and in [HMW01]. The Newton-Hensel method has been
extensively applied to other problems: for example, to solve parametric
systems (see [HKP+00] and [Sch03b]) and to obtain equidimensional
decompositions of varieties (see [Lec00] and [JKSS04]). Some of these algorithms work
under certain particular hypotheses while the others work for any given input
probabilistically (see Section 6.6.1).

Moreover, in [Lec02] an extension of the Newton-Hensel operator adapted
to the non-reduced case was presented. Then, this extension was applied to
obtain an algorithm that computes the equidimensional decomposition of a
variety (see [Lec03]).

All these algorithms share an important feature: they all use the Newton- 
Hensel operator, and therefore they can deal with input polynomials codified 
by means of slp’s. 


6.6 Other trends 


In this last section, we would like to discuss briefly some ideas involved in 
algorithmic procedures which have been mentioned earlier. 


6.6.1 Probabilistic algorithms 


Sometimes our algorithms may depend on the choice of an object satisfying 
certain conditions (a linear form separating points, a point where a polynomial 
does not vanish, etc). These choices may be very expensive from the
algorithmic point of view. Think of a polynomial $f$ in $n$ indeterminates of degree $d$. If
we want to get an n-tuple $v$ such that $f(v) \ne 0$ we have to check through many
points. Sometimes, they may even involve a procedure we do not know how 
to accomplish (for example, we know we have to look for a point that is not a 
root of certain polynomial of bounded degree, but we cannot compute exactly 
the involved polynomial). To avoid this, one can choose a random point v to 
go on. Of course, this may lead to an error. Then, a probabilistic algorithm 
would be an algorithm that ‘generally’ performs the task we want accurately, 
but with a bounded probability of error. 

Most algorithms involving slp’s can be considered as probabilistic algo- 
rithms if we do not know an adequate correct test sequence for the kind of 
slp’s involved. In this case, if we want to decide whether an slp represents the 
0 polynomial or not, we just choose a random point and evaluate the slp in 
it. If the result is not zero, we are sure that the polynomial is not the zero 
polynomial but if it is zero, we can suppose that the polynomial is the zero 
one. 

A clear example of this is the following (already mentioned in Sec- 
tion 6.4.3): We have a non-empty variety V and we want to compute its 
dimension. We cut it with a random hyperplane and consider what happens. 
Suppose that this intersection is empty. We would assume that the original 
variety is of dimension 0. It is generally the case, but if we are unlucky and 
the original variety was lying in a hyperplane parallel to the one we chose, 
our deduction would be false. 

In most of the probabilistic algorithms we consider, the generic condition
a random point should satisfy is that it is not a zero of a given polynomial
$f \in \mathbb{Q}[X_1,\dots,X_n]$ of bounded degree. The random point we choose has integer
coordinates taken from a sufficiently large finite subset of $\mathbb{N}$. The estimation of the
probability of success is done by means of the following well-known result (see
[Sch80] and [Zip79]):


Lemma 6.6.1. Let $R \subset \mathbb{N}$ be a finite subset. Let $f \in \mathbb{Q}[X_1,\dots,X_n] - \{0\}$ be
a polynomial of degree $d$. Then, for random choices of elements $a_1,\dots,a_n \in R$, we have
that

$$\mathrm{Prob}(f(a_1,\dots,a_n) = 0) \le \frac{d}{\#R}.$$
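The zero test for slp’s described above can be sketched in a few lines. The encoding used here is an assumption made for the illustration: an slp is a list of instructions `(op, i, j)`, each combining two previously computed values, with the input variables occupying the first positions. The set $R$ is taken to be $\{1,\dots,d \cdot N\}$ so that the bound of Lemma 6.6.1 gives an error probability of at most $1/N$ when the answer is "zero"; the parameter name `inverse_error` for $N$ is hypothetical.

```python
import random


def eval_slp(program, inputs):
    """Evaluate a straight-line program at a point; the last value is the output."""
    values = list(inputs)
    for op, i, j in program:
        a, b = values[i], values[j]
        if op == "+":
            values.append(a + b)
        elif op == "-":
            values.append(a - b)
        elif op == "*":
            values.append(a * b)
        else:
            raise ValueError("unknown operation: " + op)
    return values[-1]


def probably_zero(program, n_vars, degree_bound, inverse_error=10**6, rng=None):
    """One-sided test: False is always correct, True may be wrong with
    probability at most 1/inverse_error (by the bound d / #R)."""
    rng = rng or random.Random()
    size = degree_bound * inverse_error            # #R, so that d / #R <= 1 / inverse_error
    point = [rng.randrange(1, size + 1) for _ in range(n_vars)]
    return eval_slp(program, point) == 0


# (x + y)^2 - x^2 - 2xy - y^2, encoded with x at index 0 and y at index 1.
ZERO_PROGRAM = [
    ("+", 0, 1), ("*", 2, 2), ("*", 0, 0), ("*", 0, 1), ("+", 5, 5),
    ("*", 1, 1), ("-", 3, 4), ("-", 8, 6), ("-", 9, 7),
]
```

A non-zero evaluation proves that the polynomial is not zero, while a zero evaluation only makes it plausible; this is the one-sided error that the text describes.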


For example, some of the equidimensional decomposition algorithms al- 
ready mentioned ([Lec00], [JS02], [JKSS04]) are probabilistic. 


6.6.2 Non-general algorithms 


In Section 6.4, we have mentioned that a possible way to avoid the high 
complexities involved in dense encoding was to design specific algorithms that 
would not work for every polynomial system but only for some of them. This 
is already being done, in the sense that some of the algorithms being produced 
in computer algebra may be general but work better (have lower complexity) 
in special cases. This leads us to consider other invariants (not only the degree,
quantity and number of variables of the polynomials involved) to compute the
complexity of the algorithms. Roughly speaking, the new invariants involved
have to do somehow with the geometry of the varieties involved (that is the
semantic features of the problem) and not with the way the variety is presented
(the syntactic ones). For a further discussion on this topic see, for example,
[GHM+98]. Many of the previously mentioned results deal with this new kind
of invariants (see, for example, [GHH+97], [KSS97], [HKP+00], [Lec00] and
[JKSS04]).

Summary. Toric (or sparse) elimination theory uses combinatorial and discrete
geometry to exploit the structure of a given system of algebraic equations. The 
basic objects are the Newton polytope of a polynomial, the Minkowski sum of a 
set of convex polytopes, and a mixed polyhedral subdivision of such a Minkowski 
sum. Different matrices expressing the toric resultant shall be discussed, and effec- 
tive methods for their construction will be described based on discrete geometric 
operations, namely the subdivision-based methods and the incremental algorithm. 
The former allows us to produce Macaulay-type formulae of the toric resultant by 
determining a matrix minor that divides the determinant in order to yield the pre- 
cise resultant. Toric resultant matrices exhibit a quasi-Toeplitz structure, which may 
reduce complexity by almost one order of magnitude in terms of matrix dimension. 

We discuss perturbation methods to avoid the vanishing of the matrix determi- 
nant, or of the toric resultant itself, when the coefficients, which are initially viewed 
as generic, take specialized values. This is applied to the problem of implicitizing 
parametric (hyper)surfaces in the presence of base points. Another important ap- 
plication from geometric modelling concerns the prediction of the support of the 
implicit equation, based on toric elimination techniques. 

Toric resultant matrices reduce the numeric approximation of all common roots 
of a polynomial system to a problem in numerical linear algebra. In addition to a 
survey of recent results, this chapter points to open questions regarding the theory 
and practice of toric elimination methods. 


7.0 Introduction 


Toric (or sparse) elimination theory uses combinatorial and discrete geometry 
to model the structure of a given system of algebraic equations. In particu- 
lar, we consider algebraic equations with a specific monomial structure. It is 
thus possible to describe certain algebraic properties of the given system by 
combinatorial means. This chapter provides a comprehensive state-of-the-art 
introduction to the theory of toric elimination and toric resultants, paying 
special attention to the algorithmic and computational issues involved.
Different matrices expressing the toric resultants shall be discussed, and effective
methods for their construction will be defined based on discrete geometric
operations, as well as linear algebra. Toric resultant matrices exhibit a structure
close to that of Toeplitz matrices, which may reduce complexity by almost 
one order of magnitude. These matrices reduce the numeric approximation of 
all common roots to a problem in numerical linear algebra, as described in 
Section 7.5 and, in more depth, in Chapters 2 and 3. A relevant feature of 
resultant matrices in general, is their continuity with respect to small pertur- 
bations in the input coefficients. 

Our goal is to exploit the fact that systems encountered in engineer- 
ing applications are, more often than not, characterized by some structure. 
This claim shall be substantiated by examples in geometric modelling and 
computer-aided design as well as robotics; further applications exist in vision, 
and structural molecular biology (cf. [Emi97, EM99b]). A specific motivation 
comes from systems that must be repeatedly solved for different coefficients, 
in which case the resultant matrix can be computed exactly once. This
occurs, for instance, in parallel robot calibration, see e.g. [DE01c], where 10,000
instances may have to be solved.

This chapter is organized as follows. The next section describes briefly 
the main steps in the theory of toric elimination, which aspires to generalize 
the results and algorithms of its mature counterpart, classical elimination. 
Section 7.2 presents the construction of toric resultant matrices of Sylvester- 
type. The following section offers a method for implicitizing parametric (hy- 
per)surfaces, including the case of singular inputs, by means of perturbed 
toric resultants. Section 7.4 applies the tools of toric elimination for
predicting the support of the implicit equation. The last section reduces the solution of
arbitrary algebraic systems to numerical linear algebra, thus yielding methods
which avoid any issues of convergence.

This chapter will be of particular interest to graduate students and re- 
searchers in theoretical computer science or applied mathematics wishing to 
combine discrete and algebraic geometry. Some basic knowledge of discrete 
geometry for polyhedral objects in arbitrary dimension is assumed. 

Previous work and open questions are mentioned in the corresponding 
sections. All algorithms discussed have been implemented in Maple
and/or in C, and are publicly available through the author’s webpage. Most
are also available in the Maple library MULTIRES or the C++ library SYNAPS,
both accessible on the Internet¹.


7.1 Toric elimination theory 


Toric elimination generalizes several results of classical elimination theory 
on multivariate polynomial systems of arbitrary degree by considering their 
structure. This leads to stronger algebraic and combinatorial results in general 


¹ http://www-sop.inria.fr/galaad/logiciels/


7 Toric resultants 271 


[CLO98, GKZ94, Stu94a, Stu02]. Assume that the number of variables is $n$;
roots in $(\overline{K}^*)^n$ are called toric, where $\overline{K}$ is the algebraic closure of the
coefficient field $K$. We use $x^e$ to denote the monomial (or power product)
$x_1^{e_1} \cdots x_n^{e_n}$, where $e = (e_1,\dots,e_n) \in \mathbb{Z}^n$; note that we allow
integer exponents. Let the input Laurent polynomials be

$$f_1,\dots,f_n \in K[x_1^{\pm 1},\dots,x_n^{\pm 1}]. \tag{7.1}$$

Let the support $A_i = \{a_{i1},\dots,a_{im_i}\} \subset \mathbb{Z}^n$ denote the set of exponent vectors
corresponding to monomials in $f_i$ with nonzero coefficients:

$$f_i = \sum_{j=1}^{m_i} c_{ij}\, x^{a_{ij}}, \quad \text{for } c_{ij} \ne 0.$$


The Newton polytope $Q_i \subset \mathbb{R}^n$ of $f_i$ is the convex hull of support $A_i$, in
other words, the smallest convex polytope that includes all points in $A_i$. This
is a bounded subset of $\mathbb{R}^n$, of dimension up to $n$. Newton polytopes provide a
bridge from algebra to geometry since they permit certain algebraic problems
to be cast in geometric terms. For background information and algorithms on
polytope theory, the reader may refer to [Ewa96, Sch93]. For arbitrary sets $A$
and $B \subset \mathbb{R}^n$, their Minkowski sum is

$$A + B = \{a + b \mid a \in A,\ b \in B\},$$

where $a + b$ represents the vector sum of points in $\mathbb{R}^n$. For convex polytopes
$A$ and $B$, $A + B$ is a convex polytope.


Definition 7.1.1. Given convex polytopes $A_1,\dots,A_n,A_k' \subset \mathbb{R}^n$, the mixed
volume $MV(A_1,\dots,A_n)$ is the unique real-valued non-negative function,
invariant under permutations, such that

$$MV(A_1,\dots,\mu A_k + \rho A_k',\dots,A_n)$$

is equal to

$$\mu\, MV(A_1,\dots,A_k,\dots,A_n) + \rho\, MV(A_1,\dots,A_k',\dots,A_n),$$

for $\mu, \rho \in \mathbb{R}_{\ge 0}$. Moreover, we set

$$MV(A_1,\dots,A_n) := n!\, \mathrm{Vol}(A_1), \quad \text{when } A_1 = \cdots = A_n,$$

where $\mathrm{Vol}(\cdot)$ denotes euclidean volume in $\mathbb{R}^n$.


If the polytopes have integer vertices, their mixed volume takes integer values.
Two equivalent definitions are the following.

Proposition 7.1.2. For $\lambda_1,\dots,\lambda_n \in \mathbb{R}_{\ge 0}$ and for convex polytopes $Q_1,\dots,Q_n$
lying in $\mathbb{R}^n$, the mixed volume $MV(Q_1,\dots,Q_n)$ is precisely the coefficient of
$\lambda_1\lambda_2\cdots\lambda_n$ in

$$\mathrm{Vol}(\lambda_1Q_1 + \cdots + \lambda_nQ_n),$$

when the latter is expanded as a polynomial in $\lambda_1,\dots,\lambda_n$. Equivalently,

$$MV(Q_1,\dots,Q_n) = \sum_{I \subseteq \{1,\dots,n\}} (-1)^{n-|I|}\, \mathrm{Vol}\Big(\sum_{i \in I} Q_i\Big).$$

In the last equality, $I$ ranges over all subsets of $\{1,\dots,n\}$, so for $n = 2$ this
gives $MV(Q_1,Q_2) = \mathrm{Vol}(Q_1 + Q_2) - \mathrm{Vol}(Q_1) - \mathrm{Vol}(Q_2)$.
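For $n = 2$ the last formula is easy to evaluate by machine. The sketch below is a minimal illustration for planar supports with integer coordinates: it forms the Minkowski sum of the two supports, takes convex hulls with the standard monotone chain method, and measures areas with the shoelace formula. The function names are hypothetical, and nothing here extends to $n > 2$.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def twice_area(points):
    """Twice the area of the convex hull of the points (shoelace formula)."""
    hull = convex_hull(points)
    return abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1])))


def minkowski_sum(A, B):
    """All pairwise vector sums a + b; its hull is the sum of the two hulls."""
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


def mixed_volume_2d(A1, A2):
    """MV(Q1, Q2) = Vol(Q1 + Q2) - Vol(Q1) - Vol(Q2) for planar supports A1, A2."""
    doubled = twice_area(minkowski_sum(A1, A2)) - twice_area(A1) - twice_area(A2)
    return doubled // 2
```

For two copies of the unit square this returns 2, in agreement with $MV(Q,Q) = 2!\,\mathrm{Vol}(Q)$ from Definition 7.1.1, and for two copies of the triangle with vertices (0,0), (2,0), (0,2) it returns 4, the Bézout number of two dense conics.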


Exercise 7.1.3. Prove both formulae for the mixed volume from
Proposition 7.1.2, in the case $n = 2$, using Definition 7.1.1. You may start by proving
that $\mathrm{Vol}(\lambda_1Q_1 + \lambda_2Q_2)$ lies in $\mathbb{Z}[\lambda_1,\lambda_2]$ and prove the first part of
Proposition 7.1.2. Then prove the second part of the proposition for $n = 2$.


One may verify that mixed volume scales in the same way as the number
of common roots of a well-constrained polynomial system with generic
coefficients. In particular, when some Newton polytope is expressed as a Minkowski
sum, this means that the corresponding polynomial equals the product of two
polynomials $f_i f_i'$. So, the mixed volume can be written as a sum of mixed
volumes, which corresponds to the fact that the generic number of common
roots is given by a sum of root counts, each count corresponding to a system
of polynomials including either $f_i$ or $f_i'$.

Such properties were used by Kushnirenko in proving a restricted version 
of the following theorem, for the unmixed case [Kus75]. Then, Bernstein (also 
spelled Bernshtein) stated, in [Ber75], the now-famous generalization, also 
known as the Bernstein-Kushnirenko-Khovanskii (BKK) bound. We are now 
ready to state a slight generalization of this theorem. 


Theorem 7.1.4. Given system (7.1), the cardinality of common isolated zeros
in $(\overline{K}^*)^n$, counting multiplicities, is bounded by $MV(Q_1,\dots,Q_n)$, regardless
of the dimension of the variety. Equality holds when a certain subset of the
coefficients corresponding to the vertices of the $Q_i$’s are generic.


Newton polytopes provide a “sparse” counterpart of total degree. The
same holds for mixed volume vis-a-vis Bézout’s bound, which is equal to the
product of all total degrees. The two bounds coincide for completely dense
polynomials, because each Newton polytope is an n-dimensional unit simplex
scaled by $\deg f_i$. By definition, the mixed volume of the dense system is

$$MV((\deg f_1)S,\dots,(\deg f_n)S) = \prod_{i=1}^n \deg f_i \cdot MV(S,\dots,S) = \prod_{i=1}^n \deg f_i,$$

where $S$ is the unit simplex in $\mathbb{R}^n$ with vertex set $\{(0,\dots,0),(1,0,\dots,0),\dots,
(0,\dots,0,1)\}$.

There is an intermediate bound between the classical Bézout bound and
mixed volume. It is called the m-homogeneous or, simply, m-Bézout bound,
and holds for multihomogeneous polynomials. Suppose that the $n$ variables
are partitioned into $r \ge 1$ sets of $n_j$ variables each, for $j = 1,\dots,r$. Then,
$n_1 + \cdots + n_r = n$. We may assume that there is a homogenizing variable for
each variable subset $j$ such that polynomial $f_i$ becomes homogeneous with
respect to each subset, and has degree $d_{ij}$ for $i = 1,\dots,n$ and $j = 1,\dots,r$.
Then, the m-Bézout number is given by

$$\text{the coefficient of } \prod_{j=1}^r y_j^{n_j} \text{ in polynomial } \prod_{i=1}^n \Big(\sum_{j=1}^r d_{ij}\, y_j\Big).$$

This number lies always between the classical Bézout bound and the mixed
volume. For a general discussion see [MSW95].
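The m-Bézout number is a single coefficient of a product of linear forms, so it can be extracted by expanding the product while discarding monomials whose exponent in some $y_j$ would exceed $n_j$. The sketch below does exactly that; the function name and the input layout (`degrees[i][j]` for $d_{ij}$, `group_sizes[j]` for $n_j$) are assumptions made for the illustration.

```python
def m_bezout(degrees, group_sizes):
    """Coefficient of prod_j y_j**n_j in prod_i (sum_j d_ij * y_j)."""
    r = len(group_sizes)
    if len(degrees) != sum(group_sizes):
        raise ValueError("need exactly n = n_1 + ... + n_r polynomials")
    # Sparse polynomial in y_1, ..., y_r: exponent tuple -> coefficient.
    poly = {(0,) * r: 1}
    for row in degrees:
        nxt = {}
        for expo, coeff in poly.items():
            for j, d in enumerate(row):
                # Skip zero terms and monomials that can no longer contribute.
                if d == 0 or expo[j] == group_sizes[j]:
                    continue
                e = expo[:j] + (expo[j] + 1,) + expo[j + 1:]
                nxt[e] = nxt.get(e, 0) + coeff * d
        poly = nxt
    return poly.get(tuple(group_sizes), 0)
```

With a single group ($r = 1$) the result is the product of the degrees, that is, the classical Bézout bound. For $k$ equations of degree 1 in each of two groups of sizes 1 and $k-1$, the result is the coefficient of $y_1 y_2^{k-1}$ in $(y_1 + y_2)^k$, namely $k$.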


Exercise 7.1.5 (combinatorial). If all $d_{ij}$ are equal to $d_i$ then recover the
classical Bézout’s bound. Furthermore, show that the mixed volume of a
system of multihomogeneous polynomials is given by the m-Bézout bound. For
this, write every Newton polytope as $Q_i = \sum_j d_{ij} S_j$, where $S_j$ is the unit
simplex in $n_j$ dimensions.


Mixed volume is usually significantly smaller than Bézout’s bound for
systems encountered in engineering applications. One example is the simple
and generalized eigenproblems on $k \times k$ matrices. By adding an equation to
ensure unit length of vectors, the Bézout bound in both cases is $2^{k+1}$, whereas
the number of right eigenvector and eigenvalue pairs is $2k$. This is precisely
the mixed volume. We might, alternatively, apply the m-Bézout bound to
the $k \times k$ system and obtain the exact count, namely $k$.

It is possible to generalize the notion of mixed volume to that of stable 
mixed volume, thus extending the bound to affine roots [HS97b]. 

The mixed volume computation is tantamount to enumerating all mixed
cells in a mixed (tight coherent) subdivision of $Q_1 + \cdots + Q_n$. The term
“decomposition” is also used in the literature, instead of “subdivision”. We
express the operation of Minkowski addition on $n$ polytopes as a many-to-one
function from $(\mathbb{R}^n)^n$ onto $\mathbb{R}^n$:

$$(Q_1,\dots,Q_n) \to \sum_{i=1}^n Q_i \;:\; (p_1,\dots,p_n) \mapsto \sum_{i=1}^n p_i.$$

To define an inverse function, i.e., a unique tuple for every point in the sum,
lifting is a standard geometric method. Select $n$ generic linear lifting forms
$l_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1,\dots,n$. Then define the lifted polytopes

$$\hat{Q}_i := \{(p_i, l_i(p_i)) : p_i \in Q_i\} \subset \mathbb{R}^{n+1}, \quad i = 1,\dots,n.$$

Now consider the Minkowski sum $\hat{Q}_1 + \cdots + \hat{Q}_n$, which is a convex polytope
in $\mathbb{R}^{n+1}$. The lower hull of this Minkowski sum is an n-dimensional (convex)
polyhedral complex, i.e. a family of convex faces of varying dimensions that
includes all subfaces, such that the intersection of any two faces is itself a face
of both intersecting faces. The lower hull is defined with respect to the unit
vector along the $x_{n+1}$-axis: It is equal to the union of all n-dimensional faces,
or facets, whose inner normal vector has positive last component.

Each facet of $\sum_{i=1}^n \hat{Q}_i$ can itself be written as a Minkowski sum $\sum_{i=1}^n \hat{F}_i$,
where every $\hat{F}_i$ is a face of $\hat{Q}_i$, $i = 1,\dots,n$. The genericity of the $l_i$ ensures two
things: First, that the lower hull projects bijectively onto the Minkowski sum
$\sum_{i=1}^n Q_i$ of the original polytopes. Second, it guarantees tightness, which is
the technical term for expressing the fact that every lower hull facet is a unique
sum of faces $\hat{F}_i$, so that $\sum_i \dim \hat{F}_i$ equals the dimension of the facet, namely
$n$. Note that for an arbitrary lifting we would have $\sum_i \dim \hat{F}_i \ge n$, but
tightness means that equality holds.

The subdivision of the lower hull into faces of dimensions from 0 to $n$
induces a subdivision of the Minkowski sum $\sum_{i=1}^n Q_i$ into cells of respective
dimensions. Such a subdivision is called regular and is defined by projecting
each lower-hull face onto one cell. In particular facets, whose dimension is $n$,
are projected onto n-dimensional (hence, maximal) cells. Furthermore, each
(maximal) cell $\sigma$ is expressed as the Minkowski sum of faces from the $Q_i$:
Each Minkowski sum

$$\sigma = F_1 + \cdots + F_n$$

is unique, where each $F_i$ is a face of $Q_i$, so that $\sum_i \dim F_i = \dim \sigma$. Each $F_i$
corresponds to the $\hat{F}_i$ that appears in the unique sum defining the corresponding
lower-hull facet that projects onto $\sigma$. This sum is said to be optimal since it
minimizes the aggregate lifting function over the given cell.

The regularity of the subdivision implies its coherence, i.e., a continuous
change of the optimal expressions of every cell $\sigma$ as a sum of faces. This cell
complex is, therefore, a tight coherent mixed subdivision. We define the mixed
cells to be precisely those where all summand faces are one-dimensional.


Proposition 7.1.6. The mixed volume equals the sum of the volumes of all
mixed cells in the mixed subdivision. 


Example 7.1.7. Consider the system 
$$f_1 = c_{10} + c_{11} x_1 x_2 + c_{12} x_1^2 x_2 + c_{13} x_1, \qquad f_2 = c_{20} + c_{21} x_2 + c_{22} x_1 x_2 + c_{23} x_1.$$


These polynomials have Newton polytopes and Minkowski sum as shown in 
Figure 7.1. The shown subdivision is achieved with 1, = —a, — 2%2,l2 = 
4a y+. 

It is clear that the mixed volume equals 3, which is the exact number of 
common roots for two generic polynomials with these supports. However, the 
system’s Bézout number equals 4. 
Fig. 7.1. The Newton polytopes and mixed subdivision in Example 7.1.7.
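
The count in Example 7.1.7 can be checked with a few lines of Python. The sketch below relies on the planar identity MV(Q1, Q2) = Vol(Q1 + Q2) - Vol(Q1) - Vol(Q2), valid for n = 2, and assumes the supports A1 = {(0,0), (1,1), (2,1), (1,0)} and A2 = {(0,0), (0,1), (1,1), (1,0)} read off from the two polynomials. All function names are illustrative and do not refer to any existing package.

```python
from fractions import Fraction
from itertools import product


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def area(points):
    """Euclidean area of the convex hull of the points (shoelace formula)."""
    hull = convex_hull(points)
    k = len(hull)
    twice = sum(hull[i][0] * hull[(i + 1) % k][1] - hull[(i + 1) % k][0] * hull[i][1]
                for i in range(k))
    return Fraction(abs(twice), 2)


def minkowski_sum(a, b):
    """All pairwise sums; their convex hull is the Minkowski sum of the two polytopes."""
    return [(p[0] + q[0], p[1] + q[1]) for p, q in product(a, b)]


def mixed_volume_2d(a, b):
    return area(minkowski_sum(a, b)) - area(a) - area(b)


A1 = [(0, 0), (1, 1), (2, 1), (1, 0)]   # support of f_1 (assumed reading)
A2 = [(0, 0), (0, 1), (1, 1), (1, 0)]   # support of f_2 (assumed reading)
print(mixed_volume_2d(A1, A2))          # 3
```

This brute-force approach only works in the plane; it is meant as a sanity check and not as a substitute for the lifting-based computation described above.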


In the sequel, we shall see more examples of mixed subdivision. Some of 
the simplest instances appear in Examples 7.2.2 and 7.4.5. 


Exercise 7.1.8. Compute the mixed volume of 


$$A_1 = \{(0,0), (1,0), (2,0)\}, \qquad A_2 = \{(0,0), (0,1), (0,2)\}.$$


Can you find a linear lifting that yields a single mixed cell, so that the mixed 
volume equals the volume of a single cell? 


In terms of complexity classes, the computation of mixed volume is #P- 
complete. This computation identifies the integer points comprising a mono- 
mial basis of the quotient ring of the ideal defined by the input polynomials. 
Mixed, or stable mixed, cells also correspond to start systems (of binomial 
equations, hence with an immediate solution) for a toric homotopy to the 
original system’s roots. Such issues go beyond the scope of this chapter; see 
Chapter 8 or [GLW99, Li97, VG95]. 


7.1.1 The toric resultant 


For a more general introduction to resultants, one may consult Sections 1.3 
and 1.6 of Chapter 1, Section 2.3 of Chapter 2, or Chapter 3. The resultant of 
a polynomial system of n + 1 polynomials with indeterminate coefficients in 
n variables is a polynomial in these indeterminates, whose vanishing provides 
a necessary and sufficient condition for the existence of common roots of the 
system. Simple examples and a formal definition follow. 

The resultant can be expressed by Poisson's formula, namely $C \prod_{\alpha} f_0(\alpha)$,
where $f_0$ is one of the polynomials, evaluated at all common roots $\alpha$ of the
other n equations, and C is a function of the coefficients of these n polynomials.
It is then easy to see that the resultant is homogeneous in the coefficients
of each polynomial.
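
A small univariate instance (n = 1) may help fix ideas. The polynomials below are chosen only for illustration. Take $f_1 = x^2 - 3x + 2 = (x-1)(x-2)$ and $f_0 = x - 3$. For two univariate polynomials, the constant C is the leading coefficient of $f_1$ raised to the power $\deg f_0$, so that

$$\mathrm{Res}(f_1, f_0) = 1^{1} \cdot f_0(1)\, f_0(2) = (1-3)(2-3) = 2.$$

Replacing $f_0$ by $\lambda f_0$ multiplies each of the two factors by $\lambda$, hence the resultant by $\lambda^2$, which illustrates the homogeneity in the coefficients of $f_0$.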

The history of resultants (and elimination theory) includes such luminaries 
as Euler, Bézout, Cayley, and Macaulay. Different resultants exist depending 
on the space of the roots we wish to characterize, namely projective, affine,
toric or residual [BEM01, CLO98, EM99b, Stu02]. Projective resultants (also 
known as classical) were historically the first to be studied and characterize 
the existence of projective roots. We shall focus on toric resultants below. 
Residual resultants were more recently introduced in order to study roots in 
the difference of two varieties. 


Example 7.1.9. The bilinear system $f_i = c_{i0} + c_{i1}x_1 + c_{i2}x_2 + c_{i3}x_1x_2$, $i = 0,1,2$,
is used in modelling a bilinear surface in $\mathbb{R}^3$ as the set of values
$(f_0, f_1, f_2) \in \mathbb{R}^3$; see Figure 7.2.


Fig. 7.2. A bilinear surface patch. 


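The bivariate system of the $f_i$'s has toric resultant equal to

$$\mathrm{Res} = \det \begin{bmatrix}
c_{00} & c_{01} & c_{02} & c_{03} & 0 & 0 \\
c_{10} & c_{11} & c_{12} & c_{13} & 0 & 0 \\
c_{20} & c_{21} & c_{22} & c_{23} & 0 & 0 \\
0 & c_{00} & 0 & c_{02} & c_{01} & c_{03} \\
0 & c_{10} & 0 & c_{12} & c_{11} & c_{13} \\
0 & c_{20} & 0 & c_{22} & c_{21} & c_{23}
\end{bmatrix},$$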

assuming the matrix $[c_{ij}]_{i,j>0}$ is regular. Notice that the first three matrix rows
correspond to the input polynomials, whereas the last three rows correspond
to the same polynomials multiplied by $x_1$. This determinant has degree 2 per
polynomial, which is precisely the mixed volume of two input polynomials; 
remark that this is the generic number of roots. Hence the determinant equals 
the toric resultant. 
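
The structure of this matrix can be checked numerically. The sketch below, in Python with exact rational arithmetic, builds the rows as the coefficient vectors of $f_0, f_1, f_2$ and of $x_1 f_0, x_1 f_1, x_1 f_2$ over the column monomials $1, x_1, x_2, x_1x_2, x_1^2, x_1^2x_2$. The column order is inferred from the rows described in the example, and the helper names are illustrative. It confirms that the determinant vanishes for three bilinear polynomials sharing the root (2, 3), and that it is nonzero for one system without a common root.

```python
from fractions import Fraction

# Column monomials as exponent pairs: 1, x1, x2, x1*x2, x1^2, x1^2*x2.
COLUMNS = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (2, 1)]
# Common support of the f_i, in the order of the coefficients c_i0, ..., c_i3.
SUPPORT = [(0, 0), (1, 0), (0, 1), (1, 1)]


def resultant_matrix(coeffs):
    """coeffs[i] = [c_i0, c_i1, c_i2, c_i3]; rows: f_0, f_1, f_2, x1*f_0, x1*f_1, x1*f_2."""
    rows = []
    for shift in [(0, 0), (1, 0)]:
        for c in coeffs:
            row = [0] * len(COLUMNS)
            for cij, (e1, e2) in zip(c, SUPPORT):
                row[COLUMNS.index((e1 + shift[0], e2 + shift[1]))] = cij
            rows.append(row)
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if m[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            m[k], m[pivot] = m[pivot], m[k]
            result = -result
        result *= m[k][k]
        for r in range(k + 1, n):
            factor = m[r][k] / m[k][k]
            for c in range(k, n):
                m[r][c] -= factor * m[k][c]
    return result


def with_root(c1, c2, c3, root):
    """Pick c_i0 so that c_i0 + c_i1*x1 + c_i2*x2 + c_i3*x1*x2 vanishes at root."""
    x1, x2 = root
    return [-(c1 * x1 + c2 * x2 + c3 * x1 * x2), c1, c2, c3]


common = [with_root(1, 2, 3, (2, 3)), with_root(-1, 4, 1, (2, 3)), with_root(5, 1, -2, (2, 3))]
print(det(resultant_matrix(common)))         # 0: the three polynomials share the root (2, 3)

no_root = [[0, 0, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0]]   # x2, 1 + x1*x2, x1
print(abs(det(resultant_matrix(no_root))))   # 1
```

The vanishing in the first case follows because the matrix annihilates the vector of column monomials evaluated at the common root.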

In the following sections, we shall discuss ways to construct this matrix 
and, ultimately, the resultant. Two alternative ways are presented in Chapter 1.

If our only tool were the projective (classical) resultant, one would consider 
3 bivariate polynomials, each of total degree 2. The resultant has degree 4 per 
polynomial, hence 12 in total in the $c_{ij}$'s. For the bilinear system, certain
coefficients must be specialized to zero. One can show that the projective 
(classical) resultant vanishes identically in this case. 


The simplest case, where the classical projective and toric resultants co- 
incide, is that of a linear system of n + 1 equations in n variables. The de- 
terminant of the coefficient matrix is the system’s resultant and, under the 
assumption on the non-vanishing of certain minors, it becomes zero exactly 
when there is a common root. Due to the linearity of the equations, this root 
is then unique. 


Exercise 7.1.10. Using linear algebra, prove that the resultant of a linear 
system vanishes precisely when there exists a unique common root, provided 
that certain minors are nonzero. Moreover, apply Cramer’s rule in order to 
compute each coordinate of this root as a ratio of determinants. 


The question of whether two polynomials $f_1(x), f_2(x) \in K[x]$ have a common
root leads to a condition that has to be satisfied by the coefficients of
both polynomials; again classical and toric resultants coincide. The system's
Sylvester matrix is of dimension $\deg f_1 + \deg f_2$ and its determinant is the
system's resultant, provided the leading coefficients are nonzero. The rows of this
matrix contain the coefficient vectors of polynomials $x^k f_i$, for $k = 0,\dots,\deg f_j - 1$
and $\{i,j\} = \{1,2\}$.

Bézout developed a method for computing the resultant as a determinant 
of a matrix of dimension equal to $\max\{\deg f_1, \deg f_2\}$. Its construction goes
beyond the scope of this chapter; the reader may refer to Chapters 1 and 3. 

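For an illustration, consider $f_1 = a_{d_1} x^{d_1} + \cdots + a_0$, $f_2 = b_{d_2} x^{d_2} + \cdots + b_0$,
with all coefficients nonzero. Their resultant is the determinant of the Sylvester
matrix, namely

$$\begin{bmatrix}
a_{d_1} & a_{d_1-1} & \cdots & a_0 & 0 & \cdots & 0 \\
0 & a_{d_1} & a_{d_1-1} & \cdots & a_0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & a_{d_1} & a_{d_1-1} & \cdots & a_0 \\
b_{d_2} & b_{d_2-1} & \cdots & b_0 & 0 & \cdots & 0 \\
0 & b_{d_2} & b_{d_2-1} & \cdots & b_0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & b_{d_2} & b_{d_2-1} & \cdots & b_0
\end{bmatrix}.$$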


The interested reader may refer to Section 1.3 of Chapter 1 for a more detailed 
discussion on resultants of univariate polynomials. 
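
As a minimal sketch of the construction, the following Python code assembles the Sylvester matrix from two coefficient lists (leading coefficient first) and evaluates its determinant exactly. The function names and the test polynomials are chosen for illustration only.

```python
from fractions import Fraction


def sylvester_matrix(f1, f2):
    """f1, f2: coefficient lists, leading coefficient first (assumed nonzero)."""
    d1, d2 = len(f1) - 1, len(f2) - 1
    size = d1 + d2
    # d2 shifted copies of f1, followed by d1 shifted copies of f2.
    rows = [[0] * k + list(f1) + [0] * (size - d1 - 1 - k) for k in range(d2)]
    rows += [[0] * k + list(f2) + [0] * (size - d2 - 1 - k) for k in range(d1)]
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if m[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            m[k], m[pivot] = m[pivot], m[k]
            result = -result
        result *= m[k][k]
        for r in range(k + 1, n):
            factor = m[r][k] / m[k][k]
            for c in range(k, n):
                m[r][c] -= factor * m[k][c]
    return result


# x^2 - 3x + 2 = (x - 1)(x - 2) and x - 3 have no common root.
print(det(sylvester_matrix([1, -3, 2], [1, -3])))      # 2
# x^2 - 3x + 2 and x^2 + 2x - 3 = (x - 1)(x + 3) share the root 1.
print(det(sylvester_matrix([1, -3, 2], [1, 2, -3])))   # 0
```

The first value agrees with the product of $x - 3$ evaluated at the roots 1 and 2 of the first polynomial, namely $(-2)(-1) = 2$.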


Exercise 7.1.11. Using the greatest common divisor of $f_1, f_2$, prove that the
resultant of these two polynomials vanishes precisely when they have a com- 
mon root. Can you compute the coordinates of this root from the kernel 
vectors of the Sylvester matrix? 
Toric resultants express the existence of toric roots. Formally, consider the system

$$f_0, \dots, f_n \in K[x, x^{-1}], \tag{7.2}$$

with $f_i$ corresponding to a generic point $c_i = (c_{i1},\dots,c_{im_i})$ in the space of
polynomials with support $A_i$. This space is identified with the projective space $\mathbb{P}^{m_i-1}$.
Then system (7.2) can be thought of as a point $c = (c_0,\dots,c_n)$. Let Z denote
the Zariski closure, in the product of projective spaces, of the set of all c such
that the system has a solution in $(\overline{K}^*)^n$. Note that Z is an irreducible variety.

A technical assumption is that, without loss of generality, the affine lattice
generated by $\sum_{i=0}^{n} A_i$ is n-dimensional. This lattice is identified with $\mathbb{Z}^n$
possibly after a change of variables, which can be implemented by computing
the appropriate Smith normal form [Stu94a].


Definition 7.1.12. The toric (or sparse) resultant $\mathrm{Res} = \mathrm{Res}(A_0,\dots,A_n)$ of
system (7.2) is a polynomial in $\mathbb{Z}[c]$. If codim(Z) = 1 then Res is the defining
irreducible polynomial of the hypersurface Z. If codim(Z) > 1 then Res = 1.


An additional assumption we make is that the family $A_0,\dots,A_n$ is essential.
This means that, for every proper index subset $I \subset \{0,\dots,n\}$ with
cardinality $|I|$, the following holds for the dimension of certain Minkowski
sums:

$$\dim\Big(\sum_{i \in I} Q_i\Big) \ge |I|.$$

Essential support families are also discussed in Section 1.6 of Chapter 1. 

Then, the toric resultant $\mathrm{Res}(A_0,\dots,A_n)$ is homogeneous in the coefficients
of each $f_i$, with $\deg_{f_i} \mathrm{Res} = \mathrm{MV}_{-i}$. The vanishing of $\mathrm{Res}(A_0,\dots,A_n)$ is
a necessary and sufficient condition for the existence of roots in the projective
toric variety X corresponding to the Minkowski sum of the n + 1 Newton
polytopes. A projective toric variety is the closure of the image of the following
map of the torus:

$$(\mathbb{C}^*)^n \to \mathbb{P}^m : \; t \mapsto (t^{b_0} : \cdots : t^{b_m}),$$

where the $b_j \in \mathbb{Z}^n$ are the vertices of the Minkowski sum. If all Newton
polytopes are identical, then these are simply the vertices of the Newton
polytope. For instance, when this polytope is the unit simplex, the toric variety
coincides with $\mathbb{P}^n$. In the case of bilinear systems (see Example 7.1.9),
$X = \mathbb{P}^1 \times \mathbb{P}^1$. Toric varieties are also discussed in Chapter 3 as well as
in [Cox95, GKZ94, KSZ92].

Some fundamental properties of the toric resultant are as follows. 


• The toric resultant subsumes the classical resultant in the sense that they
  coincide if the polynomials are dense.

• Just as in the classical case, when all coefficients are generic, the resultant
  is irreducible.

• While the classical resultant is invariant under linear transformations of
  the variables, the toric resultant is invariant under transformations that
  preserve the polynomial support.

• In the case of non-generic coefficients, certain divisibility properties hold.
  In particular, when a system of polynomials lies in the ideal generated
  by another system, then the latter resultant is divisible by the former
  resultant.


7.2 Matrix formulae 


Different means of expressing each resultant are possible, distinguished into 
Sylvester, Bézout and hybrid-type formulae [BEM01, CLO98, DE03, EM99b, 
Stu02]. Ideally, we wish to express it as a matrix determinant, a quotient 
of two determinants, or a divisor of a determinant where the quotient is a 
nontrivial extraneous factor. This section discusses matrix formulae for the 
toric resultant known as toric resultant matrices. 

We restrict ourselves to Sylvester-type matrices; such matrices for the toric 
resultant are also known as Newton matrices because they depend on the in- 
put Newton polytopes. Sylvester-type matrices generalize the coefficient ma- 
trix of a linear system and Macaulay’s matrix. The latter extends Sylvester’s 
construction to arbitrary systems of homogeneous polynomials, and its de- 
terminant is a nontrivial multiple of the projective resultant. Other types of 
resultant matrices are discussed in Chapter 3. 

The transpose of a Sylvester-type matrix corresponds to the following
linear transformation:

$$(g_0, \dots, g_n) \mapsto \sum_{i=0}^{n} g_i f_i,$$

where the support of each polynomial $g_i$ is related to the matrix. If we
expressed the $g_i$'s in the monomial basis, then $(g_0,\dots,g_n)$ would be a vector
that multiplies from the left the transposed matrix (or from the right, the
resultant matrix itself). The support of each $g_i$ is the set of monomials
multiplying $f_i$ in order to define the rows that correspond to $f_i$. These rows contain
shifted copies of the $f_i$ coefficients. The shift is performed in such a way so
as to obtain $g_i f_i$ as the product of the $g_i$-block of the vector, multiplied by the
block of rows corresponding to $f_i$. The reader should consult the examples of
resultant matrices given above as well as in the sequel.

Overall, each row expresses the product of a monomial with an input
polynomial; its entries are coefficients of that product, each corresponding to
the monomial indexing the corresponding column. The degree of det M in the
coefficients of $f_i$ equals the number of rows with coefficients of $f_i$. This must be
greater than or equal to $\deg_{f_i} \mathrm{Res}$. It is possible to pick any one polynomial
so that there is an optimal number of rows containing its coefficients; this
number is then $\deg_{f_i} \mathrm{Res}$. This is true both in the case of Macaulay's
matrix and in the case of the Newton matrix constructions below.


7.2.1 Subdivision-based construction 


There are two main approaches to construct a well-defined, square, generically
nonsingular matrix M, such that $\mathrm{Res} \mid \det M$. The second algorithm
is incremental and shall be presented later. The first approach (cf. [CE93,
CE00, CP93, Stu94a]) relies on a mixed (tight coherent) subdivision of the
Minkowski sum

$$Q = Q_0 + \cdots + Q_n,$$

which generalizes the discussion of Section 7.1. It uses n + 1 generic linear
lifting forms $l_i : \mathbb{R}^n \to \mathbb{R}$ to define the lifted polytopes. Maximal cells in the
subdivision are written uniquely as $\sigma = F_0 + \cdots + F_n$, where $F_i \subset Q_i$ and
$\sum_i \dim F_i = n$. Therefore, at least one face is a vertex. The mixed cells are
precisely those with exactly one vertex summand, all other summand faces being
one-dimensional. If this vertex belongs to $Q_i$, then the cell is said to be i-mixed.

It can be shown [Emi96] that the i-mixed cells are the same as the mixed
cells in the mixed subdivision of the n Newton polytopes $Q_0,\dots,Q_{i-1},Q_{i+1},\dots,Q_n$,
provided that we use the same lifting functions in both cases. A direct
consequence is that the mixed volume of $f_0,\dots,f_{i-1},f_{i+1},\dots,f_n$ is given by
the sum of volumes of all i-mixed cells, thus extending Proposition 7.1.6.

The matrix construction algorithm uses a subset of $(Q+\delta) \cap \mathbb{Z}^n$ to index
the rows and columns of resultant matrix M, where $\delta \in \mathbb{R}^n$ is an arbitrarily
small and sufficiently generic vector. This vector must perturb all integer 
points indexing some row (or column) of the matrix in the strict interior of a 
maximal cell. It can be chosen randomly and the validity of our choice can be 
confirmed by the matrix construction algorithm. The probability of error for 
a vector with uniformly distributed entries is bounded in [CE00]. 

Now consider an integer point p, such that $p+\delta$ lies in an arbitrary maximal
cell $\sigma$. The algorithm associates to p the pair (i, j) if and only if $a_{ij} \in Q_i$ is
a vertex in the optimal sum of $\sigma$ and i is the maximum index of any vertex
summand. The row of M corresponding to p shall contain the coefficients of
polynomial

$$x^{p - a_{ij}} f_i.$$

The entries corresponding to column monomials that do not explicitly appear
in the row polynomial are set to zero. If $\sigma$ is i-mixed, then $a_{ij}$ is the unique
vertex summand. For non-mixed cells, the Minkowski sum has more than one
vertex, and the above rule defines a matrix with the minimum number of
rows with $f_0$, because in these cases it shall avoid the 0 index.

Therefore, the number of $f_0$ rows equals the number of integer points in
0-mixed cells, which equals

$$\mathrm{MV}(f_1, \dots, f_n) = \deg_{f_0} \mathrm{Res}.$$

As for the number of $f_i$ rows, for i > 0, this is greater than or equal to the number of
integer points in i-mixed cells. The above argument tells us that this is at least
as large as $\deg_{f_i} \mathrm{Res}$. Now recall that the degree of the matrix determinant in
the coefficients of $f_i$ equals the number of its rows containing shifted copies of
the coefficient vector of $f_i$. The algorithm may use an analogous rule to avoid
index i if we wish the matrix to have the minimum number of rows containing
$f_i$, for i > 0.

It can be proven that every principal minor of matrix M, including its
determinant, is nonzero when the polynomials have generic coefficients [CE00].
The proof of this theorem uses an adequate specialization of the input
coefficients, in terms of a new parameter t. In particular, the coefficient in $f_i$ that
multiplies the monomial $x^{a_{ij}}$ is specialized to $t^{l_i(a_{ij})}$, where $l_i$ is the lifting
applied to $Q_i$. Then, each row of the specialized matrix, indexed by some point
p, is multiplied by the power $t^{h(p) - l_k(a_{ks})}$. Here, $h(p)$ denotes the vertical distance
of $p \in \mathbb{R}^n$ to the lower hull of $\sum_{i=0}^{n} \hat{Q}_i$ and we have assumed that p has been
associated to the pair (k, s). The last step in the proof establishes that the
product of all diagonal entries in the new matrix equals the trailing term of
its determinant with respect to t.

Moreover, it is not so hard to show that the determinant of M vanishes 
whenever Res = 0. We thus arrive at the following theorem. 


Theorem 7.2.1 ([CE93, CE00]). We are given an overconstrained system 
with fixed supports. With the above notation, matrix M is well-defined and 
square. Its determinant is generically nonzero and divisible by the toric resul- 
tant Res. 


Example 7.2.2. Let us apply the subdivision-based algorithm to construct
Sylvester's matrix. Take

$$f_0 = c_{00} + c_{01}x, \qquad f_1 = c_{10} + c_{11}x + c_{12}x^2.$$

There are two possible subdivisions obtained with linear liftings; one is shown
in Figure 7.3, along with the $\delta$ perturbation.

For illustration, we note that the algorithm associates to point 2 the
pair (1, 2), i.e., the matrix row indexed by $x^2$ shall contain the coefficients
of $x^{2-2} f_1 = f_1$. A similar argument builds the other rows of the matrix. The
reader may check that this is indeed the well-known Sylvester matrix.


Example 7.2.3. For n = 2, let us apply the subdivision-based algorithm in the
case of linear polynomials. Take

$$f_i = c_{i0} + c_{i1}x_1 + c_{i2}x_2, \qquad i = 0,1,2.$$

One possible linear lifting induces the subdivision in Figure 7.4. The same
figure shows the perturbation of choice, so that we recover the matrix of the
system's coefficients, as expected. In fact, any vector $\delta \in \mathbb{R}^2_{>0}$ would do.


Fig. 7.3. The Minkowski sum of the lifted Newton segments and the induced sub- 
division in Example 7.2.2. 


Then, there are three integer points in the perturbed Minkowski sum,
namely (1,2), (1,1), and (2,1). They are associated, respectively, to pairs
[2,(0,1)], [1,(0,0)] and [0,(1,0)]. For instance, the row indexed by $x_1x_2^2$ shall
contain polynomial $x^{(1,2)-(0,1)} f_2 = x_1x_2 f_2$.



Fig. 7.4. The mixed subdivision and the perturbation with respect to the original 
Minkowski sum. 


The resultant matrix is therefore

$$M = \begin{bmatrix} c_{00} & c_{01} & c_{02} \\ c_{10} & c_{11} & c_{12} \\ c_{20} & c_{21} & c_{22} \end{bmatrix},$$

with rows corresponding to the polynomials $x_1x_2 f_i$ and columns indexed by
$x_1x_2$, $x_1^2x_2$, $x_1x_2^2$.
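
The row rule of this section is easy to mechanize. The following sketch applies it to the linear system of Example 7.2.3, using the three integer points and the pairs quoted in the text. Coefficients are kept symbolic as strings named after $c_{i0}, c_{i1}, c_{i2}$ in the definition of $f_i$; the function names are hypothetical.

```python
def matrix_row(p, vertex, poly, columns):
    """Coefficients of x^(p - vertex) * poly, listed in the order of `columns`.

    poly maps exponent tuples to coefficients; p, vertex and the entries of
    columns are exponent tuples. A ValueError from index() would signal that
    the column set does not contain every monomial of the row polynomial.
    """
    shift = tuple(pi - ai for pi, ai in zip(p, vertex))
    row = [0] * len(columns)
    for exponent, coefficient in poly.items():
        target = tuple(e + s for e, s in zip(exponent, shift))
        row[columns.index(target)] = coefficient
    return row


def linear_poly(i):
    """f_i = c_i0 + c_i1*x1 + c_i2*x2 with symbolic (string) coefficients."""
    return {(0, 0): f"c{i}0", (1, 0): f"c{i}1", (0, 1): f"c{i}2"}


# Integer points of the perturbed Minkowski sum; they index rows and columns,
# the columns standing for the monomials x1*x2, x1^2*x2, x1*x2^2.
points = [(1, 1), (2, 1), (1, 2)]
# Pairs (i, vertex of Q_i) associated to the points in the text.
assignment = {(1, 2): (2, (0, 1)), (1, 1): (1, (0, 0)), (2, 1): (0, (1, 0))}

for p in points:
    i, vertex = assignment[p]
    print(p, f"f{i}", matrix_row(p, vertex, linear_poly(i), points))
```

Every row turns out to be the coefficient vector of $x_1x_2 f_i$, so that, up to the order of the rows, the output is the coefficient matrix of the linear system.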


There is a greedy variant from [CP93] of the subdivision-based algorithm. 
It starts with a single row, corresponding to some integer point, and proceeds 
iteratively by adding new rows (and columns) as need be. For a given set 
of rows, the column set comprises all columns required to express the row 
polynomials. For a given set of columns, the rows are updated to correspond to 
the same set. The algorithm continues by adding rows and the corresponding
columns until a square matrix has been obtained. 


Example 7.2.4. Consider a system of 3 polynomials in 2 unknowns:

$$f_0 = c_{01} + c_{02}xy + c_{03}x^2y + c_{04}x,$$

$$f_1 = c_{11}y + c_{12}x^2y^2 + c_{13}x^2y + c_{14}x,$$

$$f_2 = c_{21} + c_{22}y + c_{23}xy + c_{24}x.$$



Fig. 7.5. The supports and Newton polytopes in Example 7.2.4. 


The Newton polytopes are shown in Figure 7.5. The mixed volumes are 
$\mathrm{MV}(Q_0, Q_1) = 4$, $\mathrm{MV}(Q_1, Q_2) = 4$, $\mathrm{MV}(Q_2, Q_0) = 3$, so the toric resultant's
total degree is 11. Compare this with the Bézout numbers of these subsystems: 
8,6, 12; hence the projective resultant’s total degree is 26. 
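
These values can be checked by hand with the planar identity $\mathrm{MV}(P, Q) = \mathrm{Vol}(P+Q) - \mathrm{Vol}(P) - \mathrm{Vol}(Q)$. The areas below were computed for this illustration from the supports of the three polynomials and are not stated in the text: $\mathrm{Vol}(Q_0) = 1$, $\mathrm{Vol}(Q_1) = 2$, $\mathrm{Vol}(Q_2) = 1$, $\mathrm{Vol}(Q_0+Q_1) = 7$, $\mathrm{Vol}(Q_1+Q_2) = 7$, $\mathrm{Vol}(Q_2+Q_0) = 5$. Hence

$$\mathrm{MV}(Q_0,Q_1) = 7 - 1 - 2 = 4, \qquad \mathrm{MV}(Q_1,Q_2) = 7 - 2 - 1 = 4, \qquad \mathrm{MV}(Q_2,Q_0) = 5 - 1 - 1 = 3,$$

with sum $4 + 4 + 3 = 11$. For the projective count, the total degrees of $f_0, f_1, f_2$ are 3, 4, 2, which gives

$$3 \cdot 4 + 4 \cdot 2 + 2 \cdot 3 = 12 + 8 + 6 = 26.$$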

Assume that the lifting functions are $l_0(x,y) = Lx + L^2y$, $l_1(x,y) = -L^2x - y$,
$l_2(x,y) = x - Ly$, where $L \gg 1$. The lifted Newton polytopes and the lower
hull of their Minkowski sum are shown below. These functions are sufficiently
generic since they define a mixed subdivision where every cell is uniquely
defined as the Minkowski sum of faces $F_i \subset Q_i$.
The lower hull of the Minkowski sum of the lifted $Q_i$'s is then projected to
the plane, yielding generically a mixed subdivision of Q. Figure 7.6 shows $Q+\delta$
and the integer points it contains; notice that every point belongs to a unique
maximal cell. Every maximal cell $\sigma$ is labeled by the indices of the $Q_i$ vertex
or vertices appearing in the unique Minkowski sum $\sigma = F_0 + \cdots + F_n$, with
ij denoting vertex $a_{ij} \in Q_i$. For instance, point (1,0) belongs to a maximal
cell $\sigma = a_{01} + F + F'$, where $F, F'$ are the edges $(a_{14}, a_{13}) \subset Q_1$ and $(a_{21}, a_{24})$
respectively. The corresponding row in the matrix will be filled in with the
coefficient vector of $x f_0$.


Fig. 7.6. A mixed subdivision of Q perturbed by (-3/8, -1/8), in Example 7.2.4.


The Newton matrix M appears below with rows and columns indexed by
the integer points in the perturbed Minkowski sum. M contains, by construction,
the minimum number of $f_0$ rows, namely 4. The total number of rows is
4 + 4 + 7 = 15, i.e., the determinant degree is higher than optimal by 1 and
3, respectively, in the coefficients of $f_1$ and $f_2$.


        1,0  2,0  0,1  1,1  2,1  3,1  0,2  1,2  2,2  3,2  4,2  1,3  2,3  3,3  4,3

  1,0   c01  c04   0    0   c02  c03   0    0    0    0    0    0    0    0    0
  2,0   c21  c24   0   c22  c23   0    0    0    0    0    0    0    0    0    0
  0,1    0    0   c01  c04   0    0    0   c02  c03   0    0    0    0    0    0
  1,1    0    0    0   c01  c04   0    0    0   c02  c03   0    0    0    0    0
  2,1   c14   0   c11   0   c13   0    0    0   c12   0    0    0    0    0    0
  3,1    0   c14   0   c11   0   c13   0    0    0   c12   0    0    0    0    0
  0,2    0    0   c21  c24   0    0   c22  c23   0    0    0    0    0    0    0
  1,2    0    0    0   c21  c24   0    0   c22  c23   0    0    0    0    0    0
  2,2    0    0    0    0    0    0    0    0   c01  c04   0    0    0   c02  c03
  3,2    0    0    0    0   c21  c24   0    0   c22  c23   0    0    0    0    0
  4,2    0    0    0    0    0   c14   0    0   c11   0   c13   0    0    0   c12
  1,3    0    0    0    0    0    0    0   c21  c24   0    0   c22  c23   0    0
  2,3    0    0    0   c14   0    0   c11   0   c13   0    0    0   c12   0    0
  3,3    0    0    0    0    0    0    0    0   c21  c24   0    0   c22  c23   0
  4,3    0    0    0    0    0    0    0    0    0   c21  c24   0    0   c22  c23


The greedy version produces a matrix with dimension 14 which can be 
obtained by deleting the row and the column corresponding to point (1,3). 
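
The Newton matrix of Example 7.2.4 can be regenerated from the row rule of this section. In the sketch below, the table ROW_CONTENT lists, for every integer point p of $Q+\delta$, the pair (i, j) such that the row of p holds the coefficients of $x^{p-a_{ij}} f_i$. These pairs were read off from the rows of the printed matrix rather than recomputed from the mixed subdivision, so they should be regarded as an assumption of this illustration.

```python
# Supports of f_0, f_1, f_2 in Example 7.2.4; SUPPORT[i][j - 1] is the exponent a_ij.
SUPPORT = [
    [(0, 0), (1, 1), (2, 1), (1, 0)],
    [(0, 1), (2, 2), (2, 1), (1, 0)],
    [(0, 0), (0, 1), (1, 1), (1, 0)],
]

# Integer point p of Q + delta  ->  pair (i, j): the row of p holds x^(p - a_ij) * f_i.
ROW_CONTENT = {
    (1, 0): (0, 1), (2, 0): (2, 4), (0, 1): (0, 1), (1, 1): (0, 1), (2, 1): (1, 3),
    (3, 1): (1, 3), (0, 2): (2, 2), (1, 2): (2, 2), (2, 2): (0, 1), (3, 2): (2, 3),
    (4, 2): (1, 3), (1, 3): (2, 2), (2, 3): (1, 2), (3, 3): (2, 3), (4, 3): (2, 3),
}
POINTS = list(ROW_CONTENT)   # the same points index the columns


def newton_matrix():
    matrix = []
    for p, (i, j) in ROW_CONTENT.items():
        a = SUPPORT[i][j - 1]
        row = ["0"] * len(POINTS)
        for k, b in enumerate(SUPPORT[i], start=1):
            q = (p[0] - a[0] + b[0], p[1] - a[1] + b[1])
            row[POINTS.index(q)] = f"c{i}{k}"   # coefficient of x^q in x^(p - a) * f_i
        matrix.append(row)
    return matrix


M = newton_matrix()
rows_per_poly = [sum(1 for i, _ in ROW_CONTENT.values() if i == k) for k in range(3)]
print(len(M), rows_per_poly)   # 15 [4, 4, 7]
for row in M:
    print(" ".join(f"{entry:>3}" for entry in row))
```

By construction, the diagonal entry of the row of p is the coefficient $c_{ij}$ of the vertex $a_{ij}$ associated to p.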


The subdivision-based approach can be coupled with the existence of a mi- 
nor in the Newton matrix that divides the determinant so as to yield the exact 
toric resultant [D’A02]. D’Andrea has proposed a recursive lifting procedure 
that gives a much lower value to a chosen vertex of $Q_0$. The cells whose optimal
sum does not contain this vertex are then further subdivided by assigning this
special role to a vertex of $Q_1$, and so on. This generalizes Macaulay's famous
quotient formula that yields the exact projective resultant [Mac02]. 

The existence of a non-recursive algorithm, relying on a single lifting, is 
still open in the general case. It is, nonetheless, possible for n = 2 and for 
families of sufficiently different Newton polytopes. A glimpse of what this 
lifting may look like is offered by the hybrid matrix constructed in [DE01b]. 


Example 7.2.5 (Continued from Example 7.1.9). The bilinear system
$f_i = c_{i0} + c_{i1}x_1 + c_{i2}x_2 + c_{i3}x_1x_2$, $i = 0,1,2$, despite its apparent simplicity, does not
admit an optimal toric resultant matrix, when we apply the subdivision-based
algorithm. In contrast, the greedy variant may yield an optimal matrix and the
incremental algorithm of the next section produces the optimal 6 x 6 matrix
in Example 7.1.9. It is possible to construct the following 9 x 9 numerator
matrix, using the subdivision-based algorithm; each row is labeled on the right
by its polynomial:

$$M = \left[\begin{array}{ccccccccc}
c_{00} & c_{01} & c_{02} & c_{03} & 0 & 0 & 0 & 0 & 0 \\
c_{10} & c_{11} & c_{12} & c_{13} & 0 & 0 & 0 & 0 & 0 \\
c_{20} & c_{21} & c_{22} & c_{23} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & c_{00} & c_{01} & c_{02} & 0 & 0 & c_{03} \\
0 & c_{10} & 0 & c_{12} & c_{13} & 0 & c_{11} & 0 & 0 \\
0 & 0 & c_{20} & c_{21} & 0 & c_{23} & 0 & c_{22} & 0 \\
0 & c_{20} & 0 & c_{22} & c_{23} & 0 & c_{21} & 0 & 0 \\
0 & 0 & c_{10} & c_{11} & 0 & c_{13} & 0 & c_{12} & 0 \\
0 & 0 & 0 & c_{10} & c_{11} & c_{12} & 0 & 0 & c_{13}
\end{array}\right]
\begin{array}{l} f_0 \\ f_1 \\ f_2 \\ x_1x_2 f_0 \\ x_1 f_1 \\ x_2 f_2 \\ x_1 f_2 \\ x_2 f_1 \\ x_1x_2 f_1 \end{array}$$

The choice was $\delta = (4,5)$ and the lifting is such that one vertex of the
first polytope has an infinitesimal lifting value compared to the other values.
It is now possible to define a denominator matrix M', of dimension 3,
which is a submatrix of M. It is defined by the rows indexed by polynomials
$f_1$, $f_2$, $x_1x_2 f_1$ and the respective columns; these correspond precisely to
the integer points in non-mixed cells. The ratio of the determinants yields
precisely the toric resultant.


7.2.2 Incremental construction 


The second algorithm [EC95] is incremental and usually yields smaller matrices
and, in any case, no larger than those of the subdivision algorithm. The
flexibility of the construction makes it suitable for overconstrained systems. 
On the downside, there exists a randomized step so certain properties of the 
subdivision-based construction cannot be guaranteed a priori. 

The selection of integer points, which correspond to monomials multiplying
the row polynomials, uses a vector $v \in (\mathbb{Q}^*)^n$. The goal is to choose an
adequate subset of integer points in

$$Q_{-i} = \sum_{j=0,\, j \ne i}^{n} Q_j, \qquad i = 0,\dots,n.$$

This is achieved by first sorting all points $p \in Q_{-i} \cap \mathbb{Z}^n$ according to their
distance, along v, from the boundary. This distance is defined as follows, for
point p:

$$v\text{-distance}(p) := \max\{ s \in \mathbb{R}_{\ge 0} : p + sv \in Q_{-i} \}.$$


The construction is incremental, in the sense that successively larger point 
sets are considered by decreasing the lower bound on the v-distance of the 
set’s points. For given point sets, a candidate matrix is defined. If the number 
of rows is at least as large as the number of columns and it has full rank for 
generic coefficients, then the algorithm terminates and returns a nonsingular 
maximal square submatrix. The determinant of this submatrix is a nontrivial 
multiple of the toric resultant; otherwise, new rows (and columns) are added 
to the candidate. 
In those cases where a minimum matrix of Sylvester type provably exists
[SZ94, WZ94], the incremental algorithm produces this matrix. For general
multi-homogeneous systems, the best vector is obtained in [DE03]. These are
precisely the systems for which v can be deterministically specified; otherwise,
a random v can be used. Different choices can be tried out so that the smallest
matrix may be chosen.


Example 7.2.6 (Continued from Example 7.2.4). Figure 7.7 shows $Q_{-0}$ in bold
and randomly chosen vector v = (20, 11). The different point subsets in $Q_{-0}$
with respect to v-distance are shown by the thin-line polygons. In fact, the
thin lines represent contours of fixed v-distance. The final point set from $Q_{-0}$
is the following, shown with the respective v-distances:

{(0,1; 3/20), (1,0; 1/10), (1,1; 1/10), (1,2; 1/11)}.


Fig. 7.7. $Q_{-0}$ subsets with different v-distance bounds and vector v.
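
The v-distances quoted in Example 7.2.6 can be reproduced with a short computation. The sketch below forms $Q_{-0} = Q_1 + Q_2$ as the convex hull of all pairwise vertex sums and, for each edge that the ray p + sv approaches, computes the value of s at which the ray crosses the supporting line of that edge; the v-distance is the smallest such value. The vertex coordinates are taken from the supports of $f_1$ and $f_2$ in Example 7.2.4, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def v_distance(p, v, polygon):
    """Largest s >= 0 with p + s*v in the convex polygon (CCW vertices); None if p is outside."""
    best = None
    for k in range(len(polygon)):
        a, b = polygon[k], polygon[(k + 1) % len(polygon)]
        d = (b[0] - a[0], b[1] - a[1])
        inside = d[0] * (p[1] - a[1]) - d[1] * (p[0] - a[0])   # >= 0 on the inner side of the edge
        drift = d[0] * v[1] - d[1] * v[0]                       # < 0 when moving along v approaches the edge
        if inside < 0:
            return None
        if drift < 0:
            bound = Fraction(inside, -drift)
            best = bound if best is None else min(best, bound)
    return best


Q1 = [(0, 1), (2, 2), (2, 1), (1, 0)]
Q2 = [(0, 0), (0, 1), (1, 1), (1, 0)]
Q_minus_0 = convex_hull([(a[0] + b[0], a[1] + b[1]) for a, b in product(Q1, Q2)])

v = (20, 11)
for p in [(0, 1), (1, 0), (1, 1), (1, 2)]:
    print(p, v_distance(p, v, Q_minus_0))   # 3/20, 1/10, 1/10, 1/11
```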


This v leads to a 13 x 12 nonsingular matrix M shown below. Deleting the
last row defines the 12 x 12 resultant submatrix. Columns are indexed by
integer points; each row is labeled by the exponent of the monomial that
multiplies $f_0$ (first four rows), $f_1$ (next four rows) or $f_2$ (last five rows).

        1,2  2,2  0,1  1,1  2,1  3,1  1,0  2,0  3,2  2,3  3,3  0,2

  0,1   c02  c03  c01  c04   0    0    0    0    0    0    0    0
  1,0    0    0    0    0   c02  c03  c01  c04   0    0    0    0
  1,1    0   c02   0   c01  c04   0    0    0   c03   0    0    0
  1,2   c01  c04   0    0    0    0    0    0    0   c02  c03   0
  0,0    0   c12  c11   0   c13   0   c14   0    0    0    0    0
  1,0    0    0    0   c11   0   c13   0   c14  c12   0    0    0
  1,1   c11   0    0    0   c14   0    0    0   c13   0   c12   0
  0,1    0   c13   0   c14   0    0    0    0    0   c12   0   c11
  0,1   c23   0   c21  c24   0    0    0    0    0    0    0   c22
  1,1   c22  c23   0   c21  c24   0    0    0    0    0    0    0
  1,0    0    0    0   c22  c23   0   c21  c24   0    0    0    0
  2,1    0   c22   0    0   c21  c24   0    0   c23   0    0    0
  2,2    0   c21   0    0    0    0    0    0   c24  c22  c23   0


Other techniques to reduce matrix size (and mixed volumes) include the 
introduction of new variables to express subexpressions which are common to 
several input polynomials. For an illustration, see [Emi97]. 

Clearly, mixed volume captures the inherent complexity of algebraic prob- 
lems in the context of sparse elimination and thus provides lower bounds 
on the complexity of algorithms. On the other hand, several toric elimination
algorithms rely on Minkowski sums of Newton polytopes. Therefore,
a crucial question in deriving output-sensitive upper bounds is the
relation between mixed volume and the volume of these Minkowski sums. 
In manipulating mixed volumes, some fundamental results can be found 
in [Sch93]. In particular, the Aleksandrov-Fenchel inequality leads to the fol- 
lowing bound [Emi96, Lut86]: 


$$\mathrm{MV}^n(Q_1, \dots, Q_n) \ge (n!)^n\, \mathrm{Vol}(Q_1) \cdots \mathrm{Vol}(Q_n).$$
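
For n = 2 this bound reads $\mathrm{MV}(Q_1,Q_2)^2 \ge 4\, \mathrm{Vol}(Q_1)\, \mathrm{Vol}(Q_2)$. As a check, take the three polytopes of Example 7.2.4, whose areas $\mathrm{Vol}(Q_0) = 1$, $\mathrm{Vol}(Q_1) = 2$, $\mathrm{Vol}(Q_2) = 1$ were computed for this illustration:

$$\mathrm{MV}(Q_0,Q_1)^2 = 16 \ge 4 \cdot 1 \cdot 2 = 8, \qquad \mathrm{MV}(Q_1,Q_2)^2 = 16 \ge 4 \cdot 2 \cdot 1 = 8, \qquad \mathrm{MV}(Q_2,Q_0)^2 = 9 \ge 4 \cdot 1 \cdot 1 = 4.$$

Equality holds when all polytopes coincide, since then $\mathrm{MV}(Q,\dots,Q) = n!\, \mathrm{Vol}(Q)$.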


For a system of Newton polytopes $Q_i$, define its scaling factor s to be the
minimum real value so that $Q_i + t_i \subset sQ_\mu$ for all $Q_i$, where $Q_\mu$ is the
polytope of minimum Euclidean volume and the $t_i \in \mathbb{R}^n$ are arbitrary translation
vectors. Clearly, $s \ge 1$ and s is finite if and only if all polytopes have an affine
span of the same dimension. Let e denote the base of natural logarithms, and
suppose that the volumes $\mathrm{Vol}(Q_i) > 0$ for all i. Then, for a well-constrained
system, we have

$$\mathrm{Vol}\Big(\sum_{i=1}^{n} Q_i\Big) = O(e^n s^n)\, \mathrm{MV}(Q_1, \dots, Q_n),$$


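whereas for an overconstrained system the same techniques yield

$$\mathrm{Vol}\Big(\sum_{i=0}^{n} Q_i\Big) = O\Big(\frac{e^n s^n}{\sqrt{n}}\Big) \sum_{i=0}^{n} \mathrm{MV}_{-i},$$

where $\mathrm{MV}_{-i} = \mathrm{MV}(Q_0, \dots, Q_{i-1}, Q_{i+1}, \dots, Q_n)$ [Emi96].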

As a consequence, the asymptotic bit complexity of both subdivision-based 
and incremental algorithms is singly exponential in n, proportional to the total 
degree of the toric resultant, and polynomial in the number of Q; vertices, 
provided all $\mathrm{MV}_{-i} > 0$.

Newton matrices, including the candidates constructed by the incremen- 
tal algorithm, are characterized by a structure that generalizes the Toeplitz 
structure and has been called quasi-Toeplitz [EP02] (cf. [CKL89]). By ex- 
ploiting this structure, determinant evaluation has quasi-quadratic arithmetic 
complexity and quasi-linear space complexity in the matrix dimension (here 
“quasi” means that polylogarithmic factors are ignored). The efficient imple- 
mentation of this structure is open today and is important for the competi- 
tiveness of the entire approach. 


7.3 Implicitization with base points 


The problem of switching from a rational parametric representation to an 
implicit, or algebraic, representation of a curve, surface, or hypersurface lies 
at the heart of several algorithms in computer-aided design and geometric
modelling. Given are rational parametric expressions

$$x_i = p_i(t)/q(t) \in K(t) = K(t_1,\dots,t_n), \qquad i = 0,\dots,n,$$

over some field K of characteristic zero. The implicitization problem consists
in computing the smallest algebraic hypersurface in terms of $x = (x_0,\dots,x_n)$
containing the closure of the image of the parametric map $t \mapsto x$. The most
common case is for curve and surface implicitization, namely when n = 1 and
n = 2 respectively. Resultants offer an efficient approach for this problem, but
face certain questions due to degeneracy conditions, discussed below. Several
other algorithms exist for this problem, including methods based on Gröbner
bases, moving surfaces, and residues. Their enumeration goes beyond the scope
of this chapter; cf. also Chapter 3.

Implicitization is equivalent to eliminating all parameters t from the poly- 
nomial system 

$$f_i(t) = p_i(t) - x_i q(t), \qquad i = 0,\dots,n,$$


regarded as polynomials in t. The resultant is well-defined for this system, 
and shall be a polynomial in x, equal to the implicit expression, provided 
that it does not vanish and the parametrization is generically one-to-one. 
Otherwise, the resultant is a power of the implicit equation. More subtle is 
the case where the resultant is identically zero. This happens precisely when 
there exist values of t, known as base points, for which the $f_i$ vanish for all
$x_i$; in other words, the $p_i(t)$ and $q(t)$ evaluate to zero. Base points forming
a component of codimension 1 can be easily removed by canceling common
factors in the numerator and denominator of the rational expressions for the
$x_i$'s. But higher codimension presents a harder problem.
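
A curve example without base points shows the mechanism at work. The sketch below is an illustration that is not taken from the text: it uses the standard rational parametrization of the unit circle, $x_0 = (1 - t^2)/(1 + t^2)$, $x_1 = 2t/(1 + t^2)$, and eliminates t from $f_0 = (1 - t^2) - x_0(1 + t^2)$ and $f_1 = 2t - x_1(1 + t^2)$ through their Sylvester determinant, evaluated exactly at chosen points $(x_0, x_1)$. The function names are hypothetical.

```python
from fractions import Fraction


def sylvester_matrix(f, g):
    """f, g: coefficient lists in t, leading coefficient first."""
    d1, d2 = len(f) - 1, len(g) - 1
    size = d1 + d2
    rows = [[0] * k + list(f) + [0] * (size - d1 - 1 - k) for k in range(d2)]
    rows += [[0] * k + list(g) + [0] * (size - d2 - 1 - k) for k in range(d1)]
    return rows


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if m[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            m[k], m[pivot] = m[pivot], m[k]
            result = -result
        result *= m[k][k]
        for r in range(k + 1, n):
            factor = m[r][k] / m[k][k]
            for c in range(k, n):
                m[r][c] -= factor * m[k][c]
    return result


def implicit_value(x0, x1):
    """Resultant in t of f_0 and f_1, evaluated at the point (x0, x1)."""
    f0 = [-(1 + x0), 0, 1 - x0]   # coefficients of t^2, t, 1
    f1 = [-x1, 2, -x1]
    return det(sylvester_matrix(f0, f1))


print(implicit_value(Fraction(3, 5), Fraction(4, 5)))   # 0: the point lies on the circle
print(implicit_value(1, 1))                             # 4 = 4 * (1 + 1 - 1)
```

At every point the value agrees with $4(x_0^2 + x_1^2 - 1)$, so the resultant recovers the implicit equation of the circle up to a constant factor.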

Besides cases where the (toric) resultant vanishes, another problem with 
non-generic coefficients is that the resultant matrix may be identically singu- 
lar. We understand that avoiding degeneracies is an important problem, whose 
relevance extends beyond the question of implicitization with base points. 
In [DE01a], a toric (sparse) projection operator is defined by perturbing the
subdivision-based matrix such that, after specialization, this operator is not 
identically zero but vanishes on roots in the proper components of the variety, 
including all isolated roots. 

This is a standard idea in handling degeneracies in the case of resultants. 
In the classical context, Canny [Can90] perturbed each $f_i$ by adding a new
term $\epsilon x_i^{d_i}$, where $i = 1, \dots, n$, and $f_0$ by adding $\epsilon$, where $\epsilon$ is a positive
infinitesimal indeterminate. Rojas proposed a perturbation scheme for toric
resultants in [Roj99a] which yields a perturbed resultant of low degree in $\epsilon$ but
is, nonetheless, rather expensive to compute. Our scheme generalizes [Can90] 
and requires virtually no extra computation besides the matrix construction. 

Suppose we have a family $p := (p_0(x), \dots, p_n(x))$ of Laurent polynomials
such that $\mathrm{supp}(p_i) \subseteq A_i$, and $\mathrm{Res}(p_0, \dots, p_n) \neq 0$. The Toric Generalized
Characteristic Polynomial (p-GCP) is

$$C_p(\epsilon) := \mathrm{Res}(f_0 - \epsilon p_0, \dots, f_n - \epsilon p_n).$$


Let $C_{p,k}(y_1, \dots, y_m)$ be the coefficient of $C_p(\epsilon)$ of lowest degree in $\epsilon$, namely
k. The coefficient $C_{p,k}$ is a suitable projection operator. In fact, the polynomials
$p_i$ may have random coefficients and support including precisely those
monomials of $f_i$ which appear on the diagonal of the toric resultant matrix.
The perturbation has been implemented in Maple; see also Section 7.5. 


Example 7.3.1 (Continued from Example 7.2.4). In the special case 


2 
foHltajtgt ap rgt+ a, fo=l+agt+a, 22+ 7%, 


the toric resultant vanishes for all c,; since the variety V(fo, fi) has positive 
dimension: it is formed by the union of the isolated point (1,—1) and the line 
{—1}xC. For a specific lifting and matrix construction, the trailing coefficient 
in the perturbed determinant is that of e? and equals 


$$-(c_{12} c_{13})\,(c_{14} - c_{11} + c_{12} - c_{13})\,(c_{14} + c_{11} - c_{12} + c_{13}).$$


So we can recover in the last two factors the value of $f_1$ at the isolated zero
(1,1) and the point (—1, —1) in the positive-dimensional component. 


The next example illustrates the perturbation method in applying toric 
resultants for system solving. 


Example 7.3.2. This is the example of [Roj99a]. To the system 


$$f_1 = 1 + 2x - 2x^2 y - 5xy + x^2 + 3x^3 y, \qquad f_2 = 2 + 6x - 6x^2 y - 11xy + 4x^2 + 5x^3 y,$$


we add $f_0 := u_1 x + u_2 y + u_0$, which does not have to be perturbed. We use the
function spresultant from Maple library MULTIRES to construct a 16 × 16
matrix M in parameters $u_0, u_1, u_2, \epsilon$. The numbers of rows per polynomial are,
respectively, 4, 6, 6, whereas the mixed volumes of the 2 × 2 subsystems are all
equal to 4. Here is the Maple code for these operations, where e stands for $\epsilon$:


    f0 := u1*x + u2*y + u0:
    f1 := 1 + 2*x - 2*x^2*y - 5*x*y + x^2 + 3*x^3*y:
    f2 := 2 + 6*x - 6*x^2*y - 11*x*y + 4*x^2 + 5*x^3*y:
    M := spresultant([f0,f1,f2], [x,y]):
    DM := det(M):                      # polynomial in u0,u1,u2,e
    degree(DM,e);                      # outputs 12
    ldg := ldegree(DM,e);              # outputs 1
    phi := primpart(coeff(DM,e,ldg)):  # trailing coefficient in e
    factor(phi);

For certain ω and δ, we have used $p_1 := -3x^2 + xy$, $p_2 := 2 + 5x^2$. The
perturbed determinant has maximum and minimum degree in $\epsilon$, respectively,
12 and 1. The trailing coefficient gives two factors corresponding to isolated
solutions (1/7, 7/4) and (1,1): $(49u_2 + 4u_1 + 28u_0)\,(u_2 + u_1 + u_0)$. Another
two factors give points on the line {−1} × C of solutions, but the specific
points are very sensitive to the choice of ω and δ. One such choice yields:
$(-u_0 + u_1)\,(27u_2 + 40u_1 - 40u_0)$.
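
The correspondence between isolated roots and linear factors can be checked by hand or with a few lines of code. The following Python sketch is only an illustration, not part of the Maple session above: it assumes the system reads $f_1 = 1 + 2x - 2x^2y - 5xy + x^2 + 3x^3y$ and $f_2 = 2 + 6x - 6x^2y - 11xy + 4x^2 + 5x^3y$, and the helper name `linear_factor` is ours. It verifies the two isolated roots and the line x = −1 in exact arithmetic, then clears denominators in $u_0 + a u_1 + b u_2$ for each isolated root (a, b).

```python
from fractions import Fraction
from math import gcd

def f1(x, y):
    return 1 + 2*x - 2*x**2*y - 5*x*y + x**2 + 3*x**3*y

def f2(x, y):
    return 2 + 6*x - 6*x**2*y - 11*x*y + 4*x**2 + 5*x**3*y

def linear_factor(root):
    """Integer coefficients (c0, c1, c2) of c0*u0 + c1*u1 + c2*u2, i.e. the
    factor u0 + a*u1 + b*u2 of the root (a, b) with denominators cleared."""
    a, b = (Fraction(v) for v in root)
    scale = a.denominator * b.denominator // gcd(a.denominator, b.denominator)
    return (scale, int(a * scale), int(b * scale))

isolated = [(Fraction(1, 7), Fraction(7, 4)), (Fraction(1), Fraction(1))]

# Both isolated points are exact common zeros of f1 and f2.
assert all(f1(a, b) == 0 and f2(a, b) == 0 for a, b in isolated)

# Every sampled point of the line x = -1 is a common zero as well.
assert all(f1(-1, Fraction(y)) == 0 and f2(-1, Fraction(y)) == 0
           for y in range(-3, 4))

factors = [linear_factor(root) for root in isolated]
print(factors)   # [(28, 4, 49), (1, 1, 1)]
```

The two triples are the coefficients of $u_0, u_1, u_2$ in the factors $28u_0 + 4u_1 + 49u_2$ and $u_0 + u_1 + u_2$ quoted in the example.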
Example 7.3.3. In the robot motion planning implementation of Canny's
roadmap algorithm in [HP00], numerous “degenerate” systems are encountered.
Let us examine a 3 × 3 system, where we hide $x_0$ to obtain dense
polynomials of degrees 3, 2, 1:


$$\begin{aligned}
f_0 ={}& 54x_1^3 - 21.6x_1^2x_2 - 69.12x_1x_2^2 + 41.472x_2^3 + (50.625 + 75.45x_0)x_1^2 \\
&+ (-92.25 + 32.88x_0)x_1x_2 + (-74.592x_0 + 41.4)x_2^2 \\
&+ (131.25 + 19.04x_0^2 - 168x_0)x_1 + (-405 + 25.728x_0^2 + 126.4x_0)x_2 \\
&+ (-108.8x_0^2 + 3.75x_0 + 234.375), \\
f_1 ={}& -37.725x_1^2 - 16.44x_1x_2 + 37.296x_2^2 + (-38.08x_0 + 84)x_1 \\
&+ (-63.2 - 51.456x_0)x_2 + (2.304x_0^2 + 217.6x_0 - 301.875), \\
f_2 ={}& 15x_1 - 12x_2 + 16x_0.
\end{aligned}$$


The Maple function spresultant applies an optimal perturbation to an identically
singular 14 × 14 matrix in $x_0$. Now $\det M(\epsilon)$ is of degree 14 and the
trailing coefficient of degree 2, which provides a bound on the number of affine
roots. We obtain


L434) (12815703325 
625 


$(%0) = (x ~ 625° 21336 


the first solution corresponding to the unique isolated solution but the second 
one is superfluous, hence the variety has dimension zero and degree 1. 


Our perturbation method applies directly, since the projection operator 
will contain, as an irreducible factor, the implicit equation. The extraneous 
factor has to be removed by factorization. Distinguishing the implicit equa- 
tion from the latter is straightforward by using the parametric expressions to 
generate points on the implicit surface. 


Example 7.3.4. Let us consider the de-homogenized version of a system defined
in [Bus01b]:


$$p_0 = t_1^2, \qquad p_1 = t_1^3, \qquad p_2 = t_2^2, \qquad q = t_1^3 + t_2^3.$$


It has one base point, namely (0,0), of multiplicity 4. The toric resultant here
does not vanish, so it yields the implicit equation


$$x_1^2 x_2^3 - x_0^3 x_1^2 + 2x_0^3 x_1 - x_0^3.$$


But under the change of variable $t_2 \to t_2 - 1$ the new system has zero toric
resultant. The determinant of the perturbed 27 × 27 resultant matrix has a
trailing coefficient which is precisely the implicit equation. The degree of the
trailing term is 4, which equals, in this case, the number of base points in the
toric variety counted with multiplicity.
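
As a sanity check on this example, one may substitute the parametrization into the implicit polynomial at a few rational parameter values. The Python sketch below is an illustration under the assumption that the system reads $p_0 = t_1^2$, $p_1 = t_1^3$, $p_2 = t_2^2$, $q = t_1^3 + t_2^3$, with $x_i = p_i/q$; the function names are ours. Exact rational arithmetic avoids any rounding question.

```python
from fractions import Fraction

def parametrization(t1, t2):
    """Point (x0, x1, x2) = (p0/q, p1/q, p2/q) of the surface at (t1, t2)."""
    q = t1**3 + t2**3
    return (t1**2 / q, t1**3 / q, t2**2 / q)

def implicit(x0, x1, x2):
    """Candidate implicit polynomial of the example."""
    return x1**2 * x2**3 - x0**3 * x1**2 + 2 * x0**3 * x1 - x0**3

# Parameter values chosen away from the curve q = 0.
samples = [(Fraction(1), Fraction(1)),
           (Fraction(2), Fraction(-1)),
           (Fraction(3, 5), Fraction(7, 2))]

residuals = [implicit(*parametrization(t1, t2)) for t1, t2 in samples]
print(residuals)   # every entry is an exact zero
```

The same evaluation is what distinguishes the implicit equation from an extraneous factor of a projection operator: the extraneous factor does not vanish at points generated from the parametric expressions.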


Example 7.3.5. The problem of computing the sparse, or toric, discriminant of
a polynomial specified by its support can be formulated as an implicitization
problem [DS02, GKZ94]. Let us fix the polynomial support in $\mathbb{Z}^m$, and suppose
that the support's cardinality equals m + 1 + s, s > 0. The case s = 2 was
studied in [DS02] and reduces to curve implicitization, though the approach
used in that article was not based on implicitization.

Here s = 3, so we have a surface implicitization problem with base points.
Base points forming a component of codimension 1 can be easily removed by
canceling common factors in the numerator and denominator of the rational
expressions for the $x_0, \dots, x_{s-1}$.

The parametric expressions for the x;’s and the ensuing implicitization 
problem shall be defined in terms of the entries of some matrix B, specified 
from the support of the input polynomial. Its row dimension is s and its 
column dimension equals the cardinality of the polynomial support. We do 
not go into the technical details of deriving B from the support. 

Let us consider a specific example with m = 3 and s = 3, hence the sup- 
port cardinality equals 7. The problem reduces to implicitizing the parametric 
surface given by 


$$x_i = \prod_{j=1}^{7} \left(b_{0j} + t_1 b_{1j} + t_2 b_{2j}\right)^{b_{ij}}, \qquad i = 0, 1, 2,$$


where the matrix $B = (b_{ij})$, for $i = 0, \dots, 2$, $j = 1, \dots, 7$, is as follows:


1 0 =I 0-2. —1 +1 
B= ]01 -1 2 0 -1 -1 
1 1-2 10 -1 =O 


There are base points forming components of codimension 2, including a single
affine base point (1,−1). Our algorithm constructs a 33 × 33 matrix, whose
perturbed determinant has a trailing term of degree 3 in $\epsilon$. The corresponding
coefficient has total degree 14 in $x_0, x_1, x_2$. When factorized, it yields the
precise implicit equation, which is of degree 9 in $x_0, x_1, x_2$.


7.4 Implicit support 


In this section, we exploit information on the support of the toric resultant 
in order to predict the support of the implicit equation of a parametric (hy- 
per)surface. 

Our approach is to consider the extreme monomials i.e., the vertices of the 
Newton polytope of the toric resultant Res. The output support scales with 
the sparseness of the parametric polynomials and is much tighter than the one 
predicted by degree arguments. In many cases, we obtain the exact support 
of the implicit equation, as seen by applying our Maple program. Moreover, 
it is possible to specify certain coefficients in this equation. Our motivation 
comes mainly from two implicitization algorithms which apply interpolation, 
namely the direct method of [CGKW01] and the one based on perturbations
(cf. Section 7.3 or [MC92]).

The initial form $\mathrm{In}_\omega(F)$ of a multivariate polynomial F in k variables, with
respect to some functional $\omega : \mathbb{Z}^k \to \mathbb{R}$, is the sum of all terms in F which
maximize the inner product of ω with the corresponding exponent vector. Let
us define

$$k = |A_0| + \cdots + |A_n|,$$


then ω defines a lifting function on the input system, by lifting every support
point $a \in A_i$ to $(a, \omega(a)) \in \mathbb{Z}^n \times \mathbb{R}$. This generalizes the linear lifting of
Section 7.2. The lower hull facets of the lifted Minkowski sum correspond to
maximal cells of an induced coherent mixed subdivision of Q. If ω is sufficiently
generic, then this subdivision is tight; in the sequel, we assume our mixed
subdivision is both coherent and tight and denote it by $\Delta_\omega$. If $F_i \in A_i$ is a
vertex summand of an i-mixed cell, then the corresponding coefficient in $f_i$ is
denoted by $c_{iF_i}$. We recall our assumption that the $A_i$ span $\mathbb{Z}^n$.


Theorem 7.4.1. The initial form of the toric resultant Res with respect to a 
generic w equals the monomial 


$$\mathrm{In}_\omega(\mathrm{Res}) = \prod_{i=0}^{n} \prod_{F} c_{iF_i}^{\mathrm{Vol}(F)} \qquad (7.4)$$


where Vol(·) denotes ordinary Euclidean volume and the second product is
over all mixed cells F of type i in the mixed subdivision $\Delta_\omega$.


For a detailed proof of this theorem, see [Stu94a]. This proof can be
obtained from the toric resultant matrix construction, by means of the
subdivision-based algorithm. Let us use the same specialization of the
coefficients in terms of a new parameter t, as in the discussion that leads to
Theorem 7.2.1. Then, the resultant becomes univariate in t and the proof is
completed by relating, on the one hand, the degree of $\mathrm{In}_\omega(\mathrm{Res})$ in t and, on
the other, the sum of all exponents in expression (7.4). The latter, for fixed i,
equals $MV_{-i} = \deg_{f_i} \mathrm{Res}$.

For a generic vector ω, the initial form $\mathrm{In}_\omega(\mathrm{Res})$ corresponds to a vertex
of the Newton polytope of the resultant Res. It is precisely the vertex with
inner normal ω. So, by varying the lifting ω, we can compute all vertices of
this Newton polytope, hence a superset of the resultant's support.

A bijective correspondence exists between the extreme monomials and the
configurations of the mixed cells of the $A_i$. So, it suffices to compute all distinct
mixed-cell configurations, as discussed in [MC00, MV99].

Another (simpler) means of reducing the number of relevant mixed sub- 
divisions is by bounding the number of cells. This bound is usually straight- 
forward to compute in small dimensions (e.g. when n = 2,3) and reduces 
drastically the set of mixed subdivisions. For instance, when studying the im- 
plicitization of a biquadratic surface, the total number of mixed subdivisions 
is 19728, whereas those with 8 cells is 62. 
In certain special cases, we can be more specific about the Newton polytope
of the toric resultant. First, its dimension equals k − 2n − 1 [GKZ94, Stu94a].
Certain corollaries follow: For essential support families (defined in [Stu94a]), 
a 1-dimensional Newton polytope of Res is possible if and only if all polynomi- 
als are binomials. The only resultant polytope of dimension 2 is the triangle; 
in this case the support cardinalities must be 2 and 3. For dimension 3, the 
possible polytopes are the tetrahedron, the square-based pyramid, and poly- 
tope N22 given in [Stu94a]; the support cardinalities are respectively 2,2 and 
3. 

One corollary of Theorem 7.4.1 (and of its proof) is that the coefficients 
of all extreme monomials are in {—1,1} [GKZ91, CE00, Stu94a]. Sturm- 
fels [Stu94a] also specifies, for all extreme monomials, a way to compute their 
precise coefficients. But this requires computing several coherent mixed sub- 
divisions, and goes beyond the scope of the present chapter. 

The so-called Cayley trick introduces a new point set

$$C := \{(z, a_{0j}, 1) : a_{0j} \in A_0\} \cup \{(e_i, a_{ij}, 1) : i = 1, \dots, n,\ a_{ij} \in A_i\} \subset \mathbb{Z}^{2n+1},$$

where $z = (0, \dots, 0) \in \mathbb{N}^n$ is the zero vector and $e_i = (0, \dots, 0, 1, 0, \dots, 0) \in \mathbb{N}^n$ has a
unit at the i-th position and n − 1 zeroes.


Theorem 7.4.2. The problem of computing all mixed subdivisions of supports
$A_0, \dots, A_n$, which lie in $\mathbb{Z}^n$, is equivalent to computing all regular triangulations
of the set C defined above. This set contains $k_0 + \cdots + k_n$ points, where
$k_i = |A_i|$.


Example 7.4.3 (Continued from Example 7.2.2). The Cayley trick in the
univariate case goes as follows. Consider $f_0 = c_{00} + c_{01}x$, $f_1 = c_{10} + c_{12}x^2$, then
the points in the set C appear in the columns of matrix


$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 2 \end{bmatrix}.$$


There are two possible triangulations of these points, namely 


(Lo): L} La}): Colla) 


which is the one shown in Figure 7.3, and 


(Lo}-L2} fol). (La) Lo] 2): 


Efficient algorithms (and implementations) exist for computing all regular 
triangulations of a point set [Ram01]. Regular are those triangulations that 
can be obtained by projection of a lifted triangulation. 
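
To make the construction of the Cayley point set concrete, here is a small Python sketch; it is an illustration only, and the function name `cayley_points` and the input format (one list of exponent tuples per support) are our assumptions. Each support point of $A_0$ is prefixed with the zero vector, each point of $A_i$, $i \ge 1$, with the unit vector $e_i$, and a final coordinate 1 is appended.

```python
def cayley_points(supports):
    """supports[i] lists the exponent vectors (tuples in Z^n) of A_i, i = 0..n.
    Returns the points (z, a, 1) for a in A_0 and (e_i, a, 1) for a in A_i."""
    n = len(supports) - 1
    points = []
    for i, support in enumerate(supports):
        # zero vector for A_0, unit vector e_i for A_i with i >= 1
        tag = tuple(1 if i == pos + 1 else 0 for pos in range(n))
        for a in support:
            points.append(tag + tuple(a) + (1,))
    return points

A0 = [(0,), (1,)]   # support of f_0 = c00 + c01*x
A1 = [(0,), (2,)]   # support of f_1 = c10 + c12*x^2
C = cayley_points([A0, A1])
print(C)            # [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 2, 1)]
```

Dropping the trailing coordinate 1 gives the four columns of the matrix in Example 7.4.3, and the number of points equals $k_0 + k_1 = 4$, as stated in Theorem 7.4.2.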

We produce a superset of the monomials in the support of the implicit
equation of the input. Consider, as in Section 7.3, the polynomials
$f_i(t) = p_i(t) - x_i q(t)$, where we ignore the specific values of the coefficients.
This is an interesting feature of the algorithm, namely that it considers the
monomials in the parametric equations but not their actual coefficients. This 
shows that the algorithm is suitable for use as a preprocessing off-line step 
in CAGD computations, where one needs to compute thousands of examples 
with the same support structure in real time. This handles the implicitiza- 
tion of (multiparametric) families of (hyper)surfaces, indexed by one or more 
parameters. 

Of course, the generic resultant coefficients are eventually specialized to
functions of the $x_i$. Then, any bounds on the implicit degree in the $x_i$ may be
applied, in order to reduce the final support set. One step yields as by-product
all partial mixed volumes $MV_{-i}$ for $i = 0, \dots, n$, and hence the implicit degree
separately in the $x_i$ variables.

We examine our method on some small examples, and summarize the 
results in Table 7.1 below. 


Example 7.4.4. We consider the Folium of Descartes, shown also in Figure 7.8.

$$x = 3t^2/(t^3 + 1), \qquad y = 3t/(t^3 + 1).$$


Fig. 7.8. The Folium of Descartes


The output monomials are $\{y^3, x^3, x^3y^3, xy, y^2x^2\}$. After applying the
degree bound d = 3 we obtain the support $\{y^3, x^3, xy\}$, which is optimal,
since the implicit equation is $x^3 + y^3 - 3xy = 0$.


Example 7.4.5. An example in 3 dimensions comes from [Buc88b]; the surface
is drawn in Figure 7.9. The parametric expressions are: $x = st$, $y = st^2$, $z = s^2$.


In order to apply toric elimination theory, we consider polynomials

$$f_0 = c_{00} - c_{01}st, \qquad f_1 = c_{10} - c_{11}st^2, \qquad f_2 = c_{20} - c_{21}s^2.$$


There are the following two possible mixed subdivisions, each containing
exactly three maximal cells, all of which are mixed, see Figure 7.10.
The computed support is optimal and the implicit equation is $x^4 - y^2 z = 0$.


Fig. 7.10. Mixed cells in the subdivisions, with vertex summands shown.


Example 7.4.6. Let us consider a system attributed to Fröberg and discussed
in Chapter 1.


$$x = t^{48} - t^{56} - t^{60} - t^{62} - t^{63}, \qquad y = t^{32}.$$


The Minkowski sum is the segment $Q_0 + Q_1 = [0, 95]$. One type of triangulations,
obtained from a non-linear lifting, divides it into the following 3 cells
(which are all segments):


$$(Q_0' + 0), \quad (a + Q_1), \quad (Q_0'' + 32), \qquad \text{where } Q_0' = [0, a],\ Q_0'' = [a, 63],$$


and $a \in A_0 = \{0, 48, 56, 60, 62, 63\}$. Every such triangulation yields a support
point $y^a$. The triangulation $(0 + Q_1)$, $(Q_0 + 32)$, which is induced from a linear
lifting, yields support point $x^{32}$. Note that only certain of these monomials
are extreme when we consider the resultant in terms of all input coefficients,
in order for the respective coefficients to lie in {−1, 1}.

Therefore, we find, as the toric resultant support, the triangle with vertices
(32,0), (0,48) and (0,63). Equivalently, it is delimited by the y-axis and the
lines $y = -(3/2)x + 48$ and $y = -(63/32)x + 63$, as shown in Figure 7.11.

Counting the points with integer coordinates inside (and on the sides) of 
the triangle, we see that there are 257 such points, which is seen to be optimal 
by actually computing the resultant. 
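
The count of 257 lattice points is easy to reproduce. The short Python sketch below is a direct enumeration, given as an illustration; it assumes the triangle has vertices (32,0), (0,48) and (0,63), so that its two slanted sides lie on $y = -(3/2)x + 48$ and $y = -(63/32)x + 63$. Both inequalities are multiplied out so that only integers are compared.

```python
def lattice_points_in_triangle():
    """Count integer points (x, y) in the closed triangle with vertices
    (32, 0), (0, 48), (0, 63)."""
    count = 0
    for x in range(0, 33):
        for y in range(0, 64):
            on_or_above_lower = 2 * y + 3 * x >= 96        # y >= 48 - (3/2) x
            on_or_below_upper = 32 * y + 63 * x <= 2016    # y <= 63 - (63/32) x
            if on_or_above_lower and on_or_below_upper:
                count += 1
    return count

print(lattice_points_in_triangle())   # 257
```

The result agrees with the entry 257 for this example in Table 7.1.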
Fig. 7.11. Toric resultant support.


Table 7.1. Predicting the implicit support. 


Problem                          Input    Degree of       General       # monomials
                                 degree   implicit eq.    # monomials   from [EK03]
Unit Circle                      2        2               6             3
Descartes Folium, Ex. 7.4.4      3        3               10            3
Fröberg-Dickenstein, Ex. 7.4.6   63       63              1057          257
Buchberger, Example 7.4.5        1,2      4               35            2
Busé, Example 7.3.4              3        5               56            4
Bilinear, Example 7.1.9          1,1      2               10            9


Example 7.4.7. The well-known bicubic surface represents a challenge for our
current implementation:

$$\begin{aligned}
x &= 3t(t-1)^2 + (s-1)^3 + 3s, \\
y &= 3s(s-1)^2 + t^3 + 3t, \\
z &= -3s(s^2 - 5s + 5)t^3 - 3(s^3 + 6s^2 - 9s + 1)t^2 + t(6s^3 + 9s^2 - 18s + 3) - 3s(s-1).
\end{aligned}$$

We computed 737129 regular triangulations (by TOPCOM) [Ram01]. For
illustration purposes, we show one of them:


{2,3,4,7,13},{3,4,5,7,13},{3,5,6,7,13},{3,6,9,13,14}, 
{6,9,12,13,14},{3,6,9,14,15},{6,9,12,14,15},{6,12,13,14,16}, 
{6,12,14,15,16},{6,12,15,16,17},{3,6,9,15,18},{6,9,12,15,18},
{6,12,15,17,18},{3,9,15,18,19},{3,6,9,18,19},{6,9,12,18,19}, 
{6,12,16,17,20},{6,12,17,18,20},{3,6,9,19,23},{6,9,12,19,23}, 
{6,12,19,22,23},{6,12,22,23,24},{6,12,23,24,25},{3,6,9,23, 26}, 
{6,9,12,23,26},{6,12,23,25,26},{0,2,4,7,13},{3,6,7,9,13}, 
{6,12,18,19,22},{6,12,18,20,24},{6,7,9,12,13},{6,12,18,22,24}. 


The size of the file is 383 MBytes. This underlines the fact that we should not 
compute all regular triangulations but only the mixed-cell configurations. 
7.5 Algebraic solving by linear algebra


To solve well-constrained system (7.1) by the resultant method we define an 
overconstrained system and apply the resultant matrix construction. For a 
more comprehensive discussion the reader may refer to Chapters 2 and 3, 
or [CLO98, EM99c]. 

One advantage of resultant-based methods is that resultant matrix M 
need be computed only once, for all systems with the same supports. So this 
step is thought of as being carried out off-line, while the matrix operations to 
approximate all isolated roots for each coefficient specialization constitute the 
online part. Numerical issues for the latter are discussed in [Emi97, EM99c]. 

Resultant matrices reduce system solving to certain standard operations in 
computer algebra. In particular, univariate or multivariate determinants can 
be computed by evaluation and interpolation techniques. However, the de- 
terminant development in the monomial basis may be avoided because there 
are algorithms for univariate polynomial solving as well as multivariate poly- 
nomial factorization which require only the values of these polynomials at 
specific points; cf. e.g. [Pan97]. All of these evaluations would exploit the 
quasi-Toeplitz structure of Sylvester-type matrices [CKL89, EP02]. 

We present two ways of defining an overconstrained system. The first 
method adds to the given system an extra polynomial, namely 


$$f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n \in (K[u_0, \dots, u_n])[x_1^{\pm 1}, \dots, x_n^{\pm 1}],$$


thus yielding a well-studied object, the u-resultant. Coefficients $u_1, \dots, u_n$ may
be randomly specialized or left as indeterminates; in the latter case, solving
reduces to factorizing the u-polynomial. It is known that the u-resultant
factorizes into linear factors $u_0 + u_1\alpha_1 + \cdots + u_n\alpha_n$ where $(\alpha_1, \dots, \alpha_n)$ is an
isolated root of the original system. This is an instance of Poisson's formula.
Now, $u_0$ is usually an indeterminate that we shall denote by $x_0$ below for
uniformity of notation. Matrix M will describe the multiplication map for $f_0$
in the coordinate ring of the ideal defined by the system in (7.1).

An alternative way to obtain an overconstrained system is by hiding one of 
the original variables in the coefficient field and consider the system as follows 
(we modify the previous notation to unify the subsequent discussion): 


$$f_0, \dots, f_n \in (K[x_0])[x_1^{\pm 1}, \dots, x_n^{\pm 1}].$$


M is a matrix polynomial in $x_0$, and may not be linear.

An important issue concerns the degeneracy of the input coefficients. This
may result in the trivial vanishing of the toric resultant or of det M when
there is an infinite number of common roots (in the torus or at toric infinity)
or simply due to the matrix constructed. An infinitesimal perturbation has
been proposed [DE01a] which respects the structure of Newton polytopes and
is computed at no extra asymptotic cost, cf. Section 7.3.

The perturbed determinant is a polynomial in the perturbation variable,
whose leading coefficient is nonzero whereas the least significant coefficient
is det M. Irrespective of which coefficients vanish, there is always a trailing
nonzero coefficient which vanishes when $x_0$ takes its values at the system's
isolated roots, even in the presence of positive-dimensional components. This
univariate polynomial is known as a projection operator because it projects
the roots to the $x_0$-coordinate. Univariate polynomial solving thus yields these
coordinates. Again, the u-resultant allows us to recover all coordinates via
multivariate factoring.

A basic property of resultant matrices is that right vector multiplication
expresses evaluation of the row polynomials. Specifically, multiplying by a
column vector containing the values of column monomials q at some $\alpha \in (\overline{K}^*)^n$
produces the values of the row polynomials


$$\alpha^p f_{i_p}(\alpha).$$


Computationally it is preferable to have to deal with as small a matrix as
possible. To this end we partition M into four blocks $M_{ij}$ so that the upper
left submatrix $M_{11}$ is square, independent of $x_0$, and of maximal dimension
so that it remains well-conditioned.

If the matrix is obtained from the subdivision-based algorithm, then we
know that $M_{22}$ corresponds to the integer points in the 0-mixed cells. More
precisely, the columns of $M_{22}$ are indexed by those points, whereas its rows
contain the multiples of $f_0$ with the corresponding monomials. It can be proven
that these monomials form a basis of the quotient ring defined by the ideal
of $f_1, \dots, f_n$, namely $K[x_1^{\pm 1}, \dots, x_n^{\pm 1}]/(f_1, \dots, f_n)$. For a proof, see [Emi96,
PS96].

Once $M_{11}$ is specified, let $A(x_0) = M_{22}(x_0) - M_{21}(x_0) M_{11}^{-1} M_{12}(x_0)$. To
avoid computing $M_{11}^{-1}$, we may use its LU (or QR) decomposition to solve
$M_{11} X = M_{12}$ and compute $A = M_{22} - M_{21} X$.
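
The following Python sketch illustrates this reduction on a tiny numeric instance. It is an assumption-laden illustration rather than the implementation discussed in this chapter: the blocks are given at a fixed value of $x_0$, exact rational Gauss-Jordan elimination stands in for the LU or QR decomposition mentioned above, and the names `solve` and `schur_complement` are ours.

```python
from fractions import Fraction

def solve(M, B):
    """Solve M X = B for a square nonsingular M by Gauss-Jordan elimination
    over the rationals (used here in place of an LU or QR decomposition)."""
    n = len(M)
    aug = [[Fraction(v) for v in M[i]] + [Fraction(v) for v in B[i]]
           for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def schur_complement(M11, M12, M21, M22):
    """A = M22 - M21 X, where X solves M11 X = M12 (no explicit inverse)."""
    X = solve(M11, M12)
    return [[Fraction(M22[i][j]) - sum(Fraction(M21[i][k]) * X[k][j]
                                       for k in range(len(X)))
             for j in range(len(M22[0]))]
            for i in range(len(M22))]

# Blocks of M = [[2, 1, 1], [1, 1, 0], [3, 4, 5]], whose determinant is 6.
M11, M12 = [[2, 1], [1, 1]], [[1], [0]]
M21, M22 = [[3, 4]], [[5]]
A = schur_complement(M11, M12, M21, M22)
print(A)   # [[Fraction(6, 1)]]
```

In this instance det $M_{11}$ = 1, so det A = 6 equals det M, in line with the identity det M = det $M_{11}$ · det A that makes the smaller matrix A sufficient for locating the values of $x_0$ at which M becomes singular.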

Let $\mathcal{E}$ be the monomial set indexing the rows and columns of M and let
$\mathcal{B} \subset \mathcal{E}$ index A. If $(\alpha_0, \alpha) \in \overline{K}^{n+1}$ is a common root with $\alpha \in \overline{K}^n$, then
$\det A(\alpha_0) = 0$ and, for any vector $v' = [\cdots \alpha^q \cdots]$, where q ranges over $\mathcal{B}$,
$A(\alpha_0) v' = 0$. Moreover,


$$\begin{bmatrix} M_{11} & M_{12}(\alpha_0) \\ 0 & A(\alpha_0) \end{bmatrix} \begin{bmatrix} v \\ v' \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \;\Rightarrow\; M_{11} v + M_{12}(\alpha_0) v' = 0$$


determines v once v' has been computed. Vector [v, v'] contains the values of
every monomial in $\mathcal{E}$ at α.

It can be shown that $\mathcal{E}$ affinely spans $\mathbb{Z}^n$ and an affinely independent
subset can be computed in polynomial time [Emi96]. Given v, v' and these
points, we can compute the coordinates of α. If all independent points are in
$\mathcal{B}$ then v' suffices for solving. To find the vector entries that will allow us to
recover the root coordinates, it is typically sufficient to search in $\mathcal{B}$ for pairs of
entries corresponding to $q_1, q_2$ such that $q_1 - q_2 = (0, \dots, 0, 1, 0, \dots, 0)$. This
lets us compute the i-th coordinate, if the unit appears at the i-th position.
In general, the problem of choosing the best vector entries for computing the
root coordinates is open, and different choices may lead to different accuracy.

To reduce the problem to an eigendecomposition, let r be the dimension
of $A(x_0)$, and $d \ge 1$ the highest degree of $x_0$ in any entry. We wish to find all
values of $x_0$ at which


$$A(x_0) = x_0^d A_d + x_0^{d-1} A_{d-1} + \cdots + x_0 A_1 + A_0$$


becomes singular. These are the eigenvalues of the matrix polynomial.
Furthermore, for every eigenvalue λ, there is a basis of the kernel of A(λ) defined
by the right eigenvectors of the matrix polynomial associated to λ. If $A_d$ is
nonsingular then the eigenvalues and right eigenvectors of $A(x_0)$ are the
eigenvalues and right eigenvectors of monic matrix polynomial $A_d^{-1} A(x_0)$. This is
always the case when adding an extra linear polynomial, since d = 1 and
$A_1 = I$ is the r × r identity matrix; then


$$A(x_0) = -A_1(-A_1^{-1} A_0 - x_0 I).$$


Generally, the companion matrix of a monic matrix polynomial is a square
matrix C of dimension rd. The eigenvalues of C are precisely the eigenvalues
λ of $A_d^{-1} A(x_0)$, whereas its right eigenvector $w = [v_1, \dots, v_d]$ contains a right
eigenvector $v_1$ of $A_d^{-1} A(x_0)$ and $v_i = \lambda^{i-1} v_1$, for $i = 2, \dots, d$.
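
The structure of the companion matrix and of its eigenvectors can be seen on a small example. The Python sketch below is an illustration under our own conventions: it takes an already monic matrix polynomial $x_0^d I + x_0^{d-1}A_{d-1} + \cdots + A_0$, places identity blocks on the block superdiagonal and $-A_0, \dots, -A_{d-1}$ in the last block row, and checks the relation $Cw = \lambda w$ for $w = [v_1, \lambda v_1]$. The example polynomial and the name `companion` are hypothetical choices.

```python
from fractions import Fraction

def companion(coeffs):
    """Block companion matrix of the monic matrix polynomial
    x^d I + x^(d-1) A_(d-1) + ... + A_0, given coeffs = [A_0, ..., A_(d-1)]."""
    d, r = len(coeffs), len(coeffs[0])
    size = r * d
    C = [[Fraction(0)] * size for _ in range(size)]
    for i in range(r * (d - 1)):
        C[i][i + r] = Fraction(1)        # identity blocks above the diagonal
    for k, Ak in enumerate(coeffs):      # last block row: -A_0, ..., -A_(d-1)
        for i in range(r):
            for j in range(r):
                C[r * (d - 1) + i][r * k + j] = -Fraction(Ak[i][j])
    return C

def matvec(C, w):
    return [sum(c * x for c, x in zip(row, w)) for row in C]

# A(x) = x^2 I + x A1 + A0 = [[(x-1)(x-2), x-1], [0, (x-3)(x+1)]]
A0 = [[2, -1], [0, -3]]
A1 = [[-3, 1], [0, -2]]
C = companion([A0, A1])

lam, v1 = 3, [1, -1]                     # A(3) = [[2, 2], [0, 0]] annihilates v1
w = v1 + [lam * x for x in v1]           # w = [v1, lam * v1]
print(matvec(C, w) == [lam * x for x in w])   # True
```

Here r = 2 and d = 2, so C is 4 × 4; its eigenvalues are the four roots 1, 2, 3, −1 of det A(x), and the first r entries of each eigenvector give a kernel vector of A(λ).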

We now address the question of a singular $A_d$. The following rank balancing
transformation in general improves the conditioning of $A_d$. If matrix
polynomial $A(x_0)$ is not identically singular for all $x_0$, then there exists a
transformation $x_0 \mapsto (t_1 y + t_2)/(t_3 y + t_4)$ for some $t_i \in \mathbb{Z}$, that produces a
new matrix polynomial of the same degree and with nonsingular leading
coefficient. If $A_d$ is ill-conditioned for all linear rank balancing transformations,
then we build the matrix pencil and apply a generalized eigendecomposition
to solve $C_1 x + C_0$. This returns pairs (α, β) such that matrix $C_1 \alpha + C_0 \beta$ is
singular with an associated right eigenvector.
Summary. In a 1996 paper, Andrew Sommese and Charles Wampler began
developing a new area, “Numerical Algebraic Geometry”, which would bear the same
relation to “Algebraic Geometry” that “Numerical Linear Algebra” bears to “Linear 
Algebra”. 

To approximate all isolated solutions of polynomial systems, numerical path 
following techniques have been proven reliable and efficient during the past two 
decades. In the nineties, homotopy methods were developed to exploit special struc- 
tures of the polynomial system, in particular its sparsity. For sparse systems, the 
roots are counted by the mixed volume of the Newton polytopes and computed by 
means of polyhedral homotopies. 

In Numerical Algebraic Geometry we apply and integrate homotopy continua- 
tion methods to describe solution components of polynomial systems. In particular, 
our algorithms extend beyond just finding isolated solutions to also find all posi- 
tive dimensional solution sets of polynomial systems and to decompose these into 
irreducible components. These methods can be considered as symbolic-numeric, or 
perhaps rather as numeric-symbolic, since numerical methods are applied to find 
integer results, such as the dimension and degree of solution components, and via 
interpolation, to produce symbolic results in the form of equations describing the 
irreducible components. 

Applications from mechanical engineering motivated the development of Numer- 
ical Algebraic Geometry. The performance of our software on several test problems 
illustrates the effectiveness of the new methods. 


* This material is based upon work supported by the National Science Foundation 

under Grant No. 0105653; and the Duncan Chair of the University of Notre Dame. 

** This material is based upon work supported by the National Science Foundation 
under Grant No. 0105739 and Grant No. 0134611. 
8.0 Introduction


The goal of this chapter is to provide an overview of the main ideas developed 
so far in our research program to implement numerical algebraic geometry, 
initiated in [SW96]. 

We are concerned with numerically solving polynomial systems. While the 
homotopy continuation methods of the past were limited to approximating 
only the isolated roots, we developed tools to describe all positive dimen- 
sional irreducible components of the solution set of a polynomial system. In 
particular, our algorithms produce for every irreducible component a wit- 
ness set, whose cardinality equals the degree of the component, as this set is 
obtained by intersecting the component with a general linear space of com- 
plementary dimension. A point of a witness set corresponds to what is known 
in algebraic geometry as a generic point. Our main results [SV00, SVW01a,
SVW01b, SVW01c, SVW02c, SVW02b, SVW02a, SVW03, SVW, SVW04]


can be summarized in four items: 


1. In [SV00] we presented a cascade of homotopies (extended in [SVW]) to
find candidate witness points for every component of the solution set.
Separating the junk from the candidate witness points was done in [SVW01a],
where factorization methods based on interpolation implemented a numerical
irreducible decomposition. The use of central projections and a homotopy
membership test to filter junk were the improvements of [SVW01b].

2. The treatment of high-degree components and components of multiplicity
greater than one can present numerical challenges. The use of monodromy
[SVW01c] followed by the validation by the linear trace [SVW02c]
enabled us to deal with high degree components of multiplicity one, using 
only machine floating point numbers. In [SVW02b], we presented an ap- 
proach to tracking paths on sets of multiplicity greater than one, which 
in theory makes the algorithm for irreducible decomposition completely 
general, although in practice this portion of the framework needs further 
refinement. However, for the case of the factorization of a single multi- 
variate polynomial, we can use differentiation to reduce the treatment of 
higher multiplicity components to nonsingular path tracking, as we de- 
scribed in [SVW04]. This addresses an open problem in symbolic-numeric 
computing: the factorization of multivariate polynomials with approxi- 
mate coefficients [Kal00]. 

3. Our new homotopy algorithms have been implemented and tested using 
the path trackers in the software package PHCpack [Ver99a]. In [SVW03] 
we outlined the new tools in PHCpack and described a simple interface 
to Maple. Our software found the degrees of all irreducible components 
of the cyclic 8 and 9 roots problems, which previously could only be done 
via Gröbner bases (and only by the very best implementation [Fau99]).

4. Polynomial systems with positive dimensional components occur naturally
when designing mechanical devices which permit motion. We investigated
a special case of a moving platform, discovering through a numerical
irreducible decomposition [SVW02c] a component not reported
by experts [HK00]. This and other applications of our tools to systems
coming from mechanical design are described in [SVW02a].


In this chapter we will introduce these results, first explaining homotopy methods
for isolated solutions. We can only mention some recent and exciting
new developments in fields related to numerical algebraic geometry: numerical
Schubert calculus ([HSS98], [HV00], [LWW02], [SS01], [VW02]) and numerical
jet geometry [RSV02].


8.1 Homotopy continuation methods — an overview 


Homotopy continuation methods operate in two stages. Firstly, homotopy 
methods exploit the structure of the system f(x) = 0 to find a root count 
and to construct a start system g(x) = 0 that has exactly as many regular 
solutions as the root count. This start system is embedded in the homotopy 


$$h(x,t) = \gamma(1 - t)\,g(x) + t\,f(x) = 0, \qquad t \in [0,1], \qquad (8.1)$$


with $\gamma \in \mathbb{C}$ a random number. Secondly, as t moves from 0 to 1, numerical
continuation methods trace the paths that originate at the solutions of the 
start system towards the solutions of the target system. The good properties 
we expect from a homotopy are (borrowed from [Li97, Li03]): 

1. (triviality) The solutions for t = 0 are trivial to find. 

2. (smoothness) No singularities along the solution paths occur (because
of γ).

3. (accessibility) An isolated solution of multiplicity m is reached by exactly 
m paths. 

Continuation or path-following methods are standard numerical techniques 
([AG90a, AG93, AG97], [Mor87], [Wat86, Wat89]) to trace the solution paths 
defined by the homotopy using predictor-corrector methods. The smoothness 
property of complex polynomial homotopies implies that paths never turn 
back, so that during correction the parameter t stays fixed, which simplifies 
the setup of path trackers. The adaptive step size control determines the 
step length while enforcing quadratic convergence in Newton’s method to 
avoid path crossing (see also [KX94] for the application of interval methods 
to control the step size). At the end of the path, end games ([HV98], [MSW91, 
MSW92a, MSW92b], [SWS96]) deal with diverging paths and paths leading 
to singular roots. 

Following [HSS98], we say that a homotopy is optimal if every path leads 
to one solution. The classification in Table 8.1 (from [Ver99b]) contains key 
words for three classes of polynomial systems for which optimal homotopies 
are available in PHCpack [Ver99a]. These homotopies have no diverging paths 
for generic instances of polynomial systems in their class. 

system          model                  theory       space
dense           highest degrees        Bézout       P^n      projective
sparse          Newton polytopes       Bernshtein   (C*)^n   toric
determinantal   localization posets    Schubert     G_{mr}   Grassmannian


Table 8.1. Key words of the three classes of polynomial systems. 


The earliest applications of homotopies for solving polynomial systems 
([CMPY79], [Dre77], [GZ79], [GL80], [Li83], [LS87], [Mor83], [Wri85], [Zul88]) 
belong to the dense class, where the number of paths equals the product of 
the degrees in the system. Multi-homogeneous homotopies were introduced 
in [MS87b, MS87a] and applied in [WMS90, WMS92], see also [Wam92]. Similar 
are the random product homotopies [LSY87a, LSY87b], see also [Li87] 
and [LW91]. Methods to construct linear-product start systems were introduced 
in [VH93], and extended in [VC93, VC94], [LWW96], and [WSW00]. A 
general approach to exploit product structures was developed in [MSW95]. 

Almost all systems have fewer terms than allowed by their degrees. 
Implementing constructive proofs of Bernshtein’s theorems [Ber75], polyhedral 
homotopies were introduced in [HS95] and [VVC94] to solve sparse systems 
more efficiently. These methods provided ways to start cheater’s 
homotopies ([LSY89], [LW92]) and special instances of coefficient-parameter 
polynomial continuation ([MS89, MS90]). The root count requires the 
calculation of the mixed volume⁴, for which a lift-and-prune approach was 
presented in [EC95]. Exploitation of symmetry was studied in [VG95] and 
the dynamic lifting of [VGC96] led to incremental polyhedral continuation. 
See [Ver00] for a Toric Newton. Extensions to count all affine roots (also those 
with zero components) were proposed in [EV99], [GLW99], [HS97b], [LW96], 
[Roj94, Roj99b], and [RW96]. Very efficient calculations of mixed volumes are 
described in [DKK03], [GL00, GL03], [KK03b], [LL01], and [TKF02]. 

Determinantal systems (with equations like det(A|X) = 0) arise in problems 
of enumerative geometry. The homotopies in numerical Schubert calculus 
first appeared explicitly in [HSS98], originating from questions in real 
enumerative geometry [Sot97a, Sot97b]. While real enumerative geometry [Sot03] 
is interesting on its own, these homotopies solve the pole placement 
problem ([Byr89], [RRW96, RRW98], [Ros94], [RW99]) in control theory. Recent 
improvements and applications can be found in [HV00], [LWW02], [SS01], 
and [VW02]. 

We end this section noting that homotopies have a wider application 
range than “just” solving polynomial systems, see for instance [Wat02] for 
a survey, [WBM87], and [WSM+97] for a description of HOMPACK. The 
speedup of continuation methods on multi-processor machines has been 
addressed in [ACW89, CARW93, HW89]. 


⁴ The mixed volume was nicknamed in [CR91] as the BKK bound to honor 
Bernshtein [Ber75], Kushnirenko [Kus76], and Khovanskii [Kho78b]. 


8.2 Homotopies to approximate all isolated solutions 


We first prove the regularity and boundedness of the solution paths defined 
by homotopies, before surveying path following techniques. We obtain more 
efficient homotopies by exploiting product structures and using Newton poly- 
topes to model the sparsity of the system. 


8.2.1 Regularity and boundedness of solution paths 


To illustrate how homotopy methods work, let us consider a simple example 
of solving two quadrics: 

\[ f(x,y) = \begin{pmatrix} x^2 + 4y^2 - 4 \\ 2y^2 - x \end{pmatrix} = 0. \]

To solve f(x,y) = 0, we match it with a start system of two easily solved 
quadrics: 

\[ g(x,y) = \begin{pmatrix} x^2 - 1 \\ y^2 - 1 \end{pmatrix} = 0, \]

with which we form the following homotopy: 

\[ h(x,y,t) = \begin{pmatrix} x^2 - 1 \\ y^2 - 1 \end{pmatrix} (1-t) + \begin{pmatrix} x^2 + 4y^2 - 4 \\ 2y^2 - x \end{pmatrix} t = 0. \tag{8.2} \]

At t = 1, h(x,y,t = 1) = 0 is f(x,y) = 0, the system we wish to solve, while 
at t = 0, h(x,y,t = 0) = 0 is the start system g(x,y) = 0 we can easily solve. 
As we usually move t from 0 to 1 when we solve the system, we may view the 
movement of t from 1 to 0 as a degeneration of the system, i.e., we deform 
the general hypersurfaces into degenerate products of hyperplanes. 

But does this work? We will see in a moment that it does not, but that 
there is a simple maneuver that fixes the trouble once and for all. For numerical 
solving, we would need the solution paths to be free of singularities. A 
singularity occurs where the Jacobian matrix $J_h$ of the homotopy h(x,y,t) = 0 has 
a zero determinant. The singularities along the solution paths are solutions of 
the system 

\[ \begin{cases} h(x,y,t) = 0 \\ \det(J_h) = 0 \end{cases} \qquad J_h = \begin{pmatrix} 2x & 8yt \\ -t & 2y + 2yt \end{pmatrix}. \tag{8.3} \]

If this “discriminant system” has any roots with $t \in [0,1)$, there is at least 
one homotopy solution path with singularities. To explore this situation, let’s 
solve this system by elimination. This is not a step that we normally perform 
in the course of solving f(x) = 0, but we do it here to reveal the flaw in the 
naive homotopy of (8.2) and to illustrate how we fix the flaw. To solve this 
discriminant system, we will eliminate from the system the variables x and y 
to obtain one polynomial in the continuation parameter t. The roots of this 
polynomial define the singularities along the solution paths. 

While there are many ways to perform this elimination, we let Maple 
compute a lexicographical Gröbner basis of the discriminant system. Below 
are the Maple commands; to save space we suppressed most of the output. 


> f := [x^2 + 4*y^2 - 4, 2*y^2 - x];          # target system
> g := [x^2 - 1, y^2 - 1];                     # start system
> h := t*f + (1-t)*g;                          # the homotopy
> eh := expand(h);                             # expanded homotopy
> jh := matrix(2,2,                            # Jacobian matrix
>       [[diff(eh[1],x),diff(eh[1],y)],
>        [diff(eh[2],x),diff(eh[2],y)]]);
> sys := [eh[1],eh[2],                         # discriminant system solved by
>         linalg[det](jh)];                    # pure lex Groebner basis in gb
> gb := grobner[gbasis](sys,[x,y,t],plex);
> gb[nops(gb)];                                # discriminant polynomial

  -1 + t + 10*t^3 + 29*t^5 + 13*t^4 - 5*t^2 + 12*t^7 + 21*t^6

As the degree of this “discriminant polynomial” is seven, we have seven roots: 


> fsolve(gb[nops(gb)],t,complex);              # numerical solving
  -.8818537646 - .9177002576 I, -.8818537646 + .9177002576 I,
  -.2011599690 - .8877289373 I, -.2011599690 + .8877289373 I,
  .006853764567 - .3927967328 I, .006853764567 + .3927967328 I,
  .4023199381


We are troubled by the root around 0.4, because, as t moves from 0 to 1, we 
will encounter a singularity. So our homotopy in (8.2) does not work! 
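
The troublesome real root can also be located without a computer algebra system. The following Python sketch is only an illustration and not part of the Maple session. It takes the coefficients of the discriminant polynomial as computed above, evaluates the polynomial with Horner's rule, and brackets the root in [0,1] by bisection. The names `discriminant` and `bisect` are chosen here for illustration.

```python
# Coefficients of the discriminant polynomial in t, constant term first:
# -1 + t - 5 t^2 + 10 t^3 + 13 t^4 + 29 t^5 + 21 t^6 + 12 t^7
COEFFS = [-1, 1, -5, 10, 13, 29, 21, 12]


def discriminant(t):
    """Evaluate the discriminant polynomial at t with Horner's rule."""
    value = 0.0
    for c in reversed(COEFFS):
        value = value * t + c
    return value


def bisect(func, lo, hi, tol=1e-12):
    """Bisection on [lo, hi]; assumes func changes sign on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # sign change lies in the upper half
        else:
            hi = mid                # sign change lies in the lower half
    return 0.5 * (lo + hi)


# The polynomial is negative at t = 0 and positive at t = 1, so a
# singularity lies on the way of the continuation parameter.
singular_t = bisect(discriminant, 0.0, 1.0)
print(singular_t)
```

The computed value agrees with the real root reported by fsolve, which confirms that the path of t from 0 to 1 crosses a singular value of the naive homotopy.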

We can fix this problem by the choice of a random constant $\gamma = e^{\theta\sqrt{-1}}$, 
for some random angle $\theta$. Now, consider the homotopy 

\[ h(x,y,t) = \gamma \begin{pmatrix} x^2 - 1 \\ y^2 - 1 \end{pmatrix} (1-t) + \begin{pmatrix} x^2 + 4y^2 - 4 \\ 2y^2 - x \end{pmatrix} t = 0. \tag{8.4} \]

The random choice of $\gamma$ will cause all roots of the discriminant polynomial 
to lie outside the interval [0,1). That t = 0 is excluded is obvious (because 
the start system has only regular roots), but at t = 1 we may find singular 
solutions of the given system f. 


Exercise 8.2.1. Modify the homotopy in the sequence of Maple commands 
above taking h := t*f + (1+I)*(1-t)*g; and verify that none of the roots 
of the discriminant polynomial is real. The choice of $\gamma$ as $1 + \sqrt{-1}$ does not 
give the Gröbner package of Maple a hard time. If Maple is unavailable, then 
another computer algebra system should do just as well. 


The above example illustrates the general idea behind the regularity of 
solution paths defined by a homotopy. The main theorem of elimination theory 
says that the projection of an algebraic set in complex projective space is 
again an algebraic set. Consider the discriminant system as a polynomial 
system in x, t, and $\gamma$. If we eliminate x, we obtain a polynomial in t and $\gamma$. 
This polynomial does not vanish entirely as the start system (at t = 0) has no 
singular roots. Thus it has only finitely many roots for general $\gamma$. Furthermore, 
a random complex choice of $\gamma$ will ensure that all those roots miss the interval 
(0,1). A schematic (as in [Mor87]) illustrating what cannot and what can 
happen is in Figure 8.1. 


[Figure 8.1: two schematic plots of a solution path x(t) against t.] 


Fig. 8.1. By a random choice of a complex constant $\gamma$, singularities will not occur 
for all $t \in [0,1)$ as on the left, but they may occur at the end, for t = 1. 


The same random constant $\gamma$ ensures that all paths stay bounded for 
all $t \in [0,1)$. By this we mean that no path diverges to infinity for some 
$t \in [0,1)$. Equivalently, for all $t \in (0,1)$, the system h(x,t) = 0 has 
no solutions at infinity (see Figure 8.2). To see this, invoke a homogeneous 
coordinate transformation introducing one extra coordinate, and consider 
the system in projective space. That is, consider the homogenized system 
H(X,Y,Z,t) = 0 obtained by clearing Z from denominators in the expression 
h(X/Z,Y/Z,t) = 0. Now, instead of the discriminant system of (8.3) our 
concern is the system 

\[ \begin{cases} H(X,Y,Z,t) = 0 \\ Z = 0. \end{cases} \]

Since H is homogeneous in X, Y, Z, the solutions live in projective space, which 
we can restate to say that all solutions to H(X,Y,0,t) = 0 must either satisfy 
H(X/Y,1,0,t) = 0 or H(1,Y/X,0,t) = 0 (or both, if neither X nor Y is zero). 
Either of these is a system of two polynomials in two variables and $\gamma$, and so 
one can again apply elimination and see that, except for special choices of $\gamma$, 
there will be no solutions at infinity for $t \in (0,1)$. 

Note that if the polynomials in the start system g(x,y) = 0 have lower 
degrees than their counterparts in f(x,y) = 0, then H(X,Y, Z,t) = 0 could 
have solutions at infinity for t = 0. By matching the degrees of the polynomials 
in g and f, we avoid this, which is key in proving the third property of a good 
homotopy: accessibility. 


Exercise 8.2.2. Consider the homotopy 


newd=({%2 175) a-94 {Ho 3r5 Ve 


For which values of t do we have diverging paths? Show that with a random 
complex constant $\gamma$ in h(x,y,t) = 0 (as in (8.4)) there are no divergent paths. 


[Figure 8.2: two schematic plots of a solution path x(t) against t.] 


Fig. 8.2. By a random choice of a complex constant $\gamma$, divergence will not occur 
for all $t \in [0,1)$ as on the left, but may occur at the end, for t = 1. 


To understand why the homotopy has the accessibility property (defined 
in Section 8.1), consider that whenever the number of equations is equal to 
the number of variables x, continuity implies that an isolated root at t = 1 
must be approached by at least one isolated root as $t \to 1$. Since there are no 
singularities or solutions at infinity for t in [0,1), we can carry this argument 
backwards all the way to t = 0, where we know we are starting with all the 
solutions of the homotopy. 

The arguments described above can be found in [BCSS98], see also [LS87]. 


8.2.2 Path following techniques 


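Consider any homotopy $h_k(x(t), y(t), t) = 0$, k = 1,2. Since we are interested 
to see how x and y change as t changes, we apply the operator $\partial/\partial t$ on the 
homotopy. Via the chain rule, we obtain 

\[ \frac{\partial h_k}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial h_k}{\partial y} \frac{\partial y}{\partial t} + \frac{\partial h_k}{\partial t} = 0, \quad k = 1,2. \]

Denote $\Delta x := \partial x/\partial t$ and $\Delta y := \partial y/\partial t$. For fixed t (after incrementing $t := t + \Delta t$), 
for k = 1,2, we solve the linear system 

\[ \begin{pmatrix} \dfrac{\partial h_1}{\partial x} & \dfrac{\partial h_1}{\partial y} \\[2mm] \dfrac{\partial h_2}{\partial x} & \dfrac{\partial h_2}{\partial y} \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = - \begin{pmatrix} \dfrac{\partial h_1}{\partial t} \\[2mm] \dfrac{\partial h_2}{\partial t} \end{pmatrix} \]

and obtain $(\Delta x, \Delta y)$, the tangent to the path. For some step size $\lambda > 0$, the 
updates $x := x + \lambda \Delta x$ and $y := y + \lambda \Delta y$ give the Euler predictor. 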

To avoid solving a linear system at each predictor step, we may use a 
secant predictor. A secant predictor is less accurate and will require more 
corrector steps, but the total amount of work for the prediction can be less. 
Cubic interpolation, using the tangent vectors at two points along the path, 
leads to the Hermite predictor. See Figure 8.3 for a comparison. 
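
As an illustration of the tangent computation, the following Python sketch applies one Euler predictor step to the homotopy (8.4) for the two quadrics. It is a sketch under stated assumptions: the constant `GAMMA` is fixed to an arbitrary point on the unit circle for reproducibility (in practice it is random), the 2-by-2 linear system is solved by Cramer's rule, and the function names are chosen here for illustration.

```python
GAMMA = 0.6 + 0.8j  # assumed fixed value on the unit circle; random in practice


def jacobian(x, y, t):
    """Partial derivatives of h = GAMMA*(1-t)*g + t*f with respect to x and y."""
    c = GAMMA * (1 - t)
    return ((2 * c * x + 2 * t * x, 8 * t * y),
            (-t, 2 * c * y + 4 * t * y))


def dh_dt(x, y):
    """Partial derivative of h with respect to t, which equals f - GAMMA*g."""
    f1, f2 = x * x + 4 * y * y - 4, 2 * y * y - x
    g1, g2 = x * x - 1, y * y - 1
    return (f1 - GAMMA * g1, f2 - GAMMA * g2)


def tangent(x, y, t):
    """Solve J * (dx, dy) = -dh/dt by Cramer's rule."""
    (a, b), (c, d) = jacobian(x, y, t)
    r1, r2 = dh_dt(x, y)
    det = a * d - b * c
    dx = (-r1 * d + b * r2) / det
    dy = (-a * r2 + c * r1) / det
    return dx, dy


def euler_predictor(x, y, t, step):
    """Move along the tangent: x := x + step*dx, y := y + step*dy."""
    dx, dy = tangent(x, y, t)
    return x + step * dx, y + step * dy, t + step


# Tangent at the start solution (1, 1) of g for t = 0.
dx0, dy0 = tangent(1.0, 1.0, 0.0)
print(dx0, dy0)
print(euler_predictor(1.0, 1.0, 0.0, 0.1))
```

At the start solution (1,1) the Jacobian reduces to $2\gamma$ times the identity and $\partial h/\partial t = (1,1)$, so both tangent components equal $-1/(2\gamma)$, which gives a quick hand check of the output.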


[Figure 8.3: plot titled “three predictors”.] 


Fig. 8.3. Three predictors: secant, Euler, and Hermite. 


The predictor delivers at each step of the method a new value of the contin- 
uation parameter and predicts an approximate solution of the corresponding 
new system in the homotopy. Then, the predicted approximate solution is 
corrected by applying the corrector, e.g., by Newton’s method. With a good 
homotopy, the solution paths never turn back as t increases. Therefore, the 
continuation parameter can remain fixed while correcting the predicted so- 
lution. This leads to so-called increment-and-fix path following methods. In 
practice, determining the step length during the prediction stage is done by a 
hit-or-miss method, which can be implemented by means of an adaptive step 
size control, as done in the algorithm below. 


Algorithm 8.2.3 Following one solution path by an increment-and-fix 
predictor-corrector method with an adaptive step size control strategy. 

Input:  h(x,t), x* ∈ C^n: h(x*,0) = 0,             homotopy and root
        ε > 0, max_it, max_steps,                  defines stop criteria
        min_step_size, max_step_size.              for step size control
Output: x*, success if ||h(x*,1)|| < ε.            approximate root at end

t := 0; k := 0;                                    initialization
λ := max_step_size;                                step length
old_t := t; old_x* := x*;                          back up for t and x*
previous_x* := x*;                                 previous solution
stop := false;                                     combines stop criteria
while t < 1 and not stop loop
   t := min(1, t + λ);                             secant predictor for t
   x* := x* + λ(x* − previous_x*);                 secant predictor for x*
   Newton(h(x,t), x*, ε, max_it, success);         correct with Newton
   if success                                      step size control
    then λ := min(Expand(λ), max_step_size);       enlarge step length
         previous_x* := old_x*;                    go further along path
         old_t := t; old_x* := x*;                 new back up values
    else λ := Shrink(λ);                           reduce step length
         t := old_t; x* := old_x*;                 step back and try again
   end if;
   k := k + 1;                                     augment counter
   stop := (λ < min_step_size)                     1st stop criterion
           or (k > max_steps);                     2nd stop criterion
end loop;
success := (||h(x*,1)|| < ε).                      report success or failure


The path following algorithm contains three key ingredients in its loop: the 
predictor, the corrector and the step size control. The step size is controlled 
by the functions Shrink and Expand which respectively reduce and enlarge 
$\lambda$, depending on the outcome of the corrector. 

The algorithm is still abstract because we did not specify particular values 
for the constants, such as tolerances on the solutions, minimal and maximal 
step size, maximum number of iterations of Newton’s method, etc. 
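
To make the abstract algorithm concrete, here is a minimal Python sketch of an increment-and-fix tracker applied to the homotopy (8.4) for the two quadrics. All numerical constants are assumptions picked for illustration: `GAMMA` is fixed rather than random, Expand doubles and Shrink halves the step, and the tolerances are arbitrary. The secant predictor here extrapolates through the last two accepted points, which is a common variant and not necessarily the exact rule of Algorithm 8.2.3.

```python
GAMMA = 0.6 + 0.8j  # assumed fixed for reproducibility; random in practice


def h(x, y, t):
    """Homotopy (8.4): GAMMA*(1-t)*g + t*f for the two quadrics."""
    c = GAMMA * (1 - t)
    return (c * (x * x - 1) + t * (x * x + 4 * y * y - 4),
            c * (y * y - 1) + t * (2 * y * y - x))


def jacobian(x, y, t):
    c = GAMMA * (1 - t)
    return ((2 * c * x + 2 * t * x, 8 * t * y),
            (-t, 2 * c * y + 4 * t * y))


def newton(x, y, t, eps, max_it):
    """Corrector: Newton's method at fixed t; reports success or failure."""
    for _ in range(max_it):
        h1, h2 = h(x, y, t)
        (a, b), (c, d) = jacobian(x, y, t)
        det = a * d - b * c
        if det == 0:
            return x, y, False
        dx = (-h1 * d + b * h2) / det
        dy = (-a * h2 + c * h1) / det
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < eps:
            return x, y, True
    return x, y, False


def track(x, y, eps=1e-10, max_it=5, max_steps=10000,
          min_step=1e-8, max_step=0.05):
    """Increment-and-fix tracking of one path from t = 0 to t = 1."""
    k, step = 0, max_step
    old_t, old = 0.0, (x, y)        # last accepted point
    prev_t, prev = 0.0, (x, y)      # accepted point before that
    while old_t < 1.0 and k < max_steps and step >= min_step:
        t = min(1.0, old_t + step)
        if old_t > prev_t:          # secant predictor through two points
            r = (t - old_t) / (old_t - prev_t)
            px = old[0] + r * (old[0] - prev[0])
            py = old[1] + r * (old[1] - prev[1])
        else:                       # no history yet: reuse the start point
            px, py = old
        x, y, ok = newton(px, py, t, eps, max_it)
        if ok:
            prev_t, prev = old_t, old
            old_t, old = t, (x, y)
            step = min(2 * step, max_step)   # Expand
        else:
            step = step / 2                  # Shrink, retry from old point
        k += 1
    return old[0], old[1], old_t == 1.0


# The four start solutions of g are (+-1, +-1).
starts = [(sx, sy) for sx in (1.0, -1.0) for sy in (1.0, -1.0)]
ends = [track(complex(sx), complex(sy)) for sx, sy in starts]
for x, y, success in ends:
    print(success, x, y)
```

Because the corrector keeps t fixed, a failed Newton run only costs a retry with a smaller step from the last accepted point, which is the essence of the increment-and-fix strategy.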


8.2.3 Homotopies exploiting product structures 


A typical homotopy looks as follows: 

\[ h(x,t) = \gamma\, g(x)(1-t) + f(x)\, t = 0, \quad \gamma \in \mathbb{C}, \]

where a random $\gamma$ ensures the regularity and boundedness of the paths. 
In general, for a system $f = (f_1, f_2, \ldots, f_n)$, with $d_i = \deg(f_i)$, we set up 
a start system g(x) = 0 as follows: 

\[ g(x) = \begin{cases} \alpha_1 x_1^{d_1} - \beta_1 = 0 \\ \alpha_2 x_2^{d_2} - \beta_2 = 0 \\ \qquad \vdots \\ \alpha_n x_n^{d_n} - \beta_n = 0 \end{cases} \]

where the coefficients $\alpha_i$ and $\beta_i$, for i = 1,2,...,n, are chosen at random in C. 
Therefore g(x) = 0 has exactly as many regular solutions as the total degree 
$D = \prod_{i=1}^{n} d_i$. So this homotopy defines D solution paths. The theorem of 
Bézout (which can be proven constructively via a homotopy) indeed predicts 
D as the number of solutions in complex projective space. 
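
The start system is easy to solve because its equations decouple: each $x_i$ ranges over the $d_i$-th roots of $\beta_i/\alpha_i$. A small Python sketch of this enumeration follows. The degrees and coefficients in the example are hypothetical, and the function name `start_solutions` is chosen here for illustration.

```python
import cmath
from itertools import product


def start_solutions(degrees, alphas, betas):
    """All solutions of alpha_i * x_i**d_i - beta_i = 0, i = 1..n."""
    roots_per_variable = []
    for d, a, b in zip(degrees, alphas, betas):
        c = b / a                       # x_i**d = c
        radius = abs(c) ** (1.0 / d)
        angle = cmath.phase(c)
        roots_per_variable.append(
            [cmath.rect(radius, (angle + 2 * cmath.pi * k) / d)
             for k in range(d)])
    # Every combination of one root per variable is a solution.
    return list(product(*roots_per_variable))


# Hypothetical data: degrees (2, 3) and arbitrary nonzero coefficients.
degrees = (2, 3)
alphas = (1 + 2j, 0.5 - 1j)
betas = (2 - 1j, -1 + 0.3j)
solutions = start_solutions(degrees, alphas, betas)
print(len(solutions))   # the total degree D = 2 * 3
```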


Exercise 8.2.4. Consider the following polynomial system: 

\[ \begin{cases} x^{108} + 1.1\, y^{54} - 1.1\, y = 0 \\ y^{108} + 1.1\, x^{54} - 1.1\, x = 0. \end{cases} \]

This system was constructed by Bertrand Haas [Haa02] who provided with 
this system a counterexample to the conjecture of Kushnirenko on the number 
of real roots of sparse systems. Use phc (available via [Ver99a]) to determine⁵ 
how many solutions of this system are complex. How many are real? 


⁵ This may take some time (especially on slower machines)... 


In almost all applications, the systems have far fewer solutions than the 
total degree (most solutions lie at infinity and are of no interest). Consider 
the eigenvalue problem $Ax = \lambda x$, $A \in \mathbb{C}^{n \times n}$. To make the system square, 
we can add one general hyperplane to obtain a unique x for every $\lambda$. If we 
apply Bézout’s theorem in a straightforward manner, we consider $Ax = \lambda x$ 
as a system of n quadrics and obtain a homotopy with $D = 2^n$ paths to trace, 
whereas we know there can be at most n solutions! This is a highly wasteful 
computation, as $2^n - n$ of our solution paths are certain to diverge to infinity. 

Let us examine the smallest nontrivial case: n = 2. We consider a general 
2-by-2 matrix A and scale the components of the eigenvector with a random 
hyperplane $c_0 + c_1 x_1 + c_2 x_2 = 0$. So we look at the system 

\[ f(x_1, x_2, \lambda) = \begin{cases} a_{11} x_1 + a_{12} x_2 - \lambda x_1 = 0 \\ a_{21} x_1 + a_{22} x_2 - \lambda x_2 = 0 \\ c_0 + c_1 x_1 + c_2 x_2 = 0. \end{cases} \]

To compute the solutions at infinity, we go to homogeneous coordinates, 
replacing $x_1$ by $x_1/x_0$, $x_2$ by $x_2/x_0$, and $\lambda$ by $\lambda/x_0$. Clearing denominators: 

\[ f(x_0, x_1, x_2, \lambda) = \begin{cases} a_{11} x_0 x_1 + a_{12} x_0 x_2 - \lambda x_1 = 0 \\ a_{21} x_0 x_1 + a_{22} x_0 x_2 - \lambda x_2 = 0 \\ c_0 x_0 + c_1 x_1 + c_2 x_2 = 0. \end{cases} \]

Solutions at infinity are solutions of the homogeneous system with $x_0 = 0$ 
and not all components equal to zero. If $\lambda = 0$, then $(x_0, x_1, x_2, \lambda) = 
(0, 1, -c_1/c_2, 0)$ represents one point at infinity. If $\lambda \neq 0$, then the other 
solution at infinity is represented by $(x_0, x_1, x_2, \lambda) = (0,0,0,1)$. So we found 
where two of the four paths are diverging to. 

Now we embed our problem in multi-projective space: $\mathbb{P} \times \mathbb{P}^2$, separating 
$\lambda$ from x. To go to 2-homogeneous coordinates, we replace $x_2$ by $x_2/x_0$, $x_1$ 
by $x_1/x_0$ (as before), and $\lambda$ by $\lambda_1/\lambda_0$ (this is new), clearing denominators: 

\[ f(x_0, x_1, x_2, \lambda_0, \lambda_1) = \begin{cases} a_{11} \lambda_0 x_1 + a_{12} \lambda_0 x_2 - \lambda_1 x_1 = 0 \\ a_{21} \lambda_0 x_1 + a_{22} \lambda_0 x_2 - \lambda_1 x_2 = 0 \\ c_0 x_0 + c_1 x_1 + c_2 x_2 = 0. \end{cases} \tag{8.5} \]

Looking for roots at infinity of (8.5) we see that $\lambda_0 = 0$ implies $x_1 = 0$, 
$x_2 = 0$, and thus $x_0 = 0$, so we have no proper solution at infinity with 
$\lambda_0 = 0$. For the solutions at infinity of (8.5) with $x_0 = 0$, considering (8.5) 
back in affine coordinates for $\lambda$ (as $\lambda_0$ cannot be zero), we are looking at a 
homogeneous system of three equations in three unknowns: $x_1$, $x_2$, and $\lambda$. For 
general matrices, the trivial zero solution is the only solution. Thus in $\mathbb{P} \times \mathbb{P}^2$, 
the general eigenvalue problem has no solutions at infinity. 

To arrive at a version of Bézout’s theorem for polynomial systems over 
multi-projective spaces, we need to define our root count. Continuing our 
running example, we record the degrees in $\lambda$ and $\{x_1, x_2\}$ of every equation in 
a table. Corresponding to this degree table is a linear-product start system, 
written in (8.6) in table format. 


        degree table                linear-product start system
       {λ}   {x1,x2}                {λ}               {x1,x2}
(1)     1       1          (1)   α10 + α11 λ    β10 + β11 x1 + β12 x2
(2)     1       1    <=>   (2)   α20 + α21 λ    β20 + β21 x1 + β22 x2      (8.6)
(3)     0       1          (3)   1              β30 + β31 x1 + β32 x2


The coefficients $\alpha_{ij}$ and $\beta_{ij}$ in (8.6) are randomly chosen complex numbers. 
Except for a special choice of these numbers, the linear-product start system 
will always have two regular solutions. We derive a formal root count following 
the moves we make to solve the linear-product start system: 

\[ B = \underbrace{1 \times 1 \times 1}_{(1)\lambda\ (2)x\ (3)x} + \underbrace{1 \times 1 \times 1}_{(2)\lambda\ (1)x\ (3)x} + \underbrace{0 \times 1 \times 1}_{(3)\lambda\ (1)x\ (2)x} = 2. \tag{8.7} \]

The labels in (8.7) show the navigation through the table at the right of (8.6). 


Exercise 8.2.5. The matrix polynomial 

\[ p(\lambda) = A_d \lambda^d + A_{d-1} \lambda^{d-1} + \cdots + A_1 \lambda + A_0, \quad A_i \in \mathbb{C}^{n \times n}, \]

defines the generalized eigenvalue problem $p(\lambda)x = 0$. How many generalized 
eigenvalue-eigenvector pairs can we expect for randomly chosen matrices $A_i$? 


To show that B is an upper bound for the number of isolated solutions of 
a polynomial system, we show the regularity and boundedness of the solution 
paths in a typical homotopy, using a linear-product start system. 

For many applications (like the eigenvalue problem) it is obvious how 
best to separate the variables into a partition. But for black-box solvers and 
systems with no apparent product structure, we need to find that partition 
which leads to the smallest Bézout number. One strategy is to enumerate all 
partitions and retain the partition with the smallest Bézout number. While 
the number of partitions grows faster than $2^n$, finding the smallest Bézout 
number for n = 8 by enumeration takes less than a second of CPU time. 
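
The enumeration strategy can be sketched in a few lines of Python for the 2-by-2 eigenvalue problem. The sketch assumes the usual formula for the multi-homogeneous Bézout number: with degree table $d_{ij}$ and blocks of sizes $k_j$, it is the coefficient of $\prod_j z_j^{k_j}$ in $\prod_i \sum_j d_{ij} z_j$, which is the formal root count of (8.7) in general form. Polynomials are represented only by their supports, and all function names are chosen here for illustration.

```python
def set_partitions(items):
    """Generate all partitions of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):      # add `first` to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part          # or open a new block


def degree_table(supports, partition):
    """Degree of every equation in every block of variables."""
    return [[max(sum(mono[v] for v in block) for mono in support)
             for block in partition] for support in supports]


def bezout_number(table, sizes):
    """Coefficient of prod_j z_j**sizes[j] in prod_i sum_j table[i][j]*z_j."""
    def count(i, remaining):
        if i == len(table):
            return 1 if not any(remaining) else 0
        total = 0
        for j, d in enumerate(table[i]):
            if d and remaining[j]:
                nxt = remaining[:j] + (remaining[j] - 1,) + remaining[j + 1:]
                total += d * count(i + 1, nxt)
        return total
    return count(0, tuple(sizes))


# Supports (exponent vectors in lambda, x1, x2) of the eigenvalue problem.
supports = [
    [(0, 1, 0), (0, 0, 1), (1, 1, 0)],   # a11*x1 + a12*x2 - lambda*x1
    [(0, 1, 0), (0, 0, 1), (1, 0, 1)],   # a21*x1 + a22*x2 - lambda*x2
    [(0, 0, 0), (0, 1, 0), (0, 0, 1)],   # c0 + c1*x1 + c2*x2
]
results = []
for partition in set_partitions([0, 1, 2]):
    table = degree_table(supports, partition)
    sizes = [len(block) for block in partition]
    results.append((bezout_number(table, sizes), partition))
best = min(results, key=lambda pair: pair[0])
print(best)
```

For three variables there are only five partitions, so the search is instantaneous. The recursion in `bezout_number` is exponential in the number of equations, which is acceptable for the small n considered here.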

Instead of using one partition of the variables to model the product struc- 
ture of the system, we may use different partitions for different equations, 
and extend this even further to construct in this way general linear-product 
start systems. The solving of the start system now involves more work, but 
we may expect the homotopy to be more efficient. Schematically, a hierarchy 
of homotopies (and root counting methods) is given in Figure 8.4. 


[Figure 8.4: diagram of nested classes of homotopies. From bottom to top: 
Total Degree, Multihomogeneous, Linear Products, then Polynomial Products 
and Newton Polytopes, with Coefficient-Parameter at the top. The axes are 
labeled “more efficient (fewer paths)” and “easier start system”.] 


Fig. 8.4. A hierarchy of homotopies. All homotopies below the dashed line A can be 
done automatically. Above the line, apply special ad-hoc methods or bootstrapping. 
Homotopies at the bottom of the hierarchy are often used to find solutions for generic 
instances of parameters in a coefficient-parameter homotopy. 


We will not address the “polynomial products” of Figure 8.4 here; for 
this, see [MSW95]. We introduce the Newton polytopes in the following two 
sections. 

For the relation between Newton polytopes and resultants, see Chapter 7. 


8.2.4 Polyhedral homotopies to glue real solutions 


The purpose of this section is to introduce Newton polytopes and polyhedral 
homotopies, but without mixed volumes. So we restrict ourselves to polynomials 
in one variable. Instead of “just” solving a polynomial in one variable, 
we consider a different problem: 


Input:  k distinct monomials in one variable x: 
        $x^{a_1}, x^{a_2}, \ldots, x^{a_k}$, with $a_i \neq a_j$ for $i \neq j$. 

Output: coefficients $c_{a_1}, c_{a_2}, \ldots, c_{a_k}$ such that 
        $f(x) = c_{a_1} x^{a_1} + c_{a_2} x^{a_2} + \cdots + c_{a_k} x^{a_k}$ 
        has k − 1 positive real roots. 


For example, take $1, x^5, x^7, x^{11}$ as monomials on input. Then the problem is 
to find $c_0$, $c_5$, $c_7$, and $c_{11}$ such that $f(x) = c_0 + c_5 x^5 + c_7 x^7 + c_{11} x^{11}$ has three 
positive real solutions. We will show that we can reduce this four dimensional 
problem to one of dimension one, considering the homotopy 

\[ h(x,t) = t - x^5 + x^7 - x^{11} t = 0, \quad \text{for } t \geq 0. \]


The alternation of signs in the coefficients is a deliberate choice to maximize 
the number of positive real roots. The Newton polytope of a polynomial is 
the convex hull of the exponent vectors of those monomials appearing with 
a nonzero coefficient. The choice of powers of t with each monomial is such 
that the lower hull of the Newton polytope of h contains among its vertices 
all exponents of the given monomials, see Figure 8.5. 


[Figure 8.5: Newton polytope with vertices (0,1), (5,0), (7,0), and (11,1).] 


Fig. 8.5. The Newton polytope of the homotopy h(x,t) = 0 is spanned by the 
exponent vectors of the monomials in h. The lower hull of the Newton polytope is 
drawn in solid lines. 


At t = 0, the homotopy $h(x,0) = -x^5 + x^7 = x^5(-1 + x^2) = 0$ has one 
positive real root: x = 1. The idea is to choose $t = \Delta t > 0$ such that Newton’s 
method applied to $h(x, \Delta t) = 0$ converges quadratically to a positive real root 
starting at x = 1. (Notice that by the fortunate choice of the powers of t in 
the example, $\Delta t$ can be chosen arbitrarily large as h(1,t) = 0, for any value 
of t.) 

Observe that the monomials in h(x,0) correspond to the lowest middle 
edge on the lower hull of the Newton polytope of h in Figure 8.5. For every 
edge of the lower hull of the Newton polytope we will use one homotopy to 
find one positive real root. Each time, the start system in the homotopy has its 
two monomials as vertices of an edge of the lower hull. To find the homotopies 
with the other two edges, we need to consider the vectors orthogonal to the 
edges (we call those vectors inner normals), see Figure 8.6. 


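[Figure 8.6: the lower hull through (0,1), (5,0), (7,0), and (11,1), with the 
inner normals $v_1$, $v_2$, $v_3$ drawn on its three edges.] 


Fig. 8.6. Inner normals $v_1 = (\frac{1}{5}, 1)$, $v_2 = (0,1)$, $v_3 = (-\frac{1}{4}, 1)$ on the edges of the 
lower hull of the Newton polytope of the homotopy h(x,t) = 0. 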


The inner normal $v_1$ attains the minimal inner product with those vertices 
on the first edge of the lower hull. Consider the four values of the inner product 
of $v_1$ with the four vertices of the lower hull: 

\[ \left(\tfrac{1}{5}, 1\right) \cdot \{ (0,1), (5,0), (7,0), (11,1) \} = \left\{ 1, 1, \tfrac{7}{5}, \tfrac{16}{5} \right\}. \]

Indeed, the minimal values occur with the first two vertices which span the 
first edge. This geometric construction motivates the following change of 
coordinates: let $x = y t^{1/5}$, we obtain 

\[ h(y,t) = t - y^5 t + y^7 t^{7/5} - y^{11} t^{16/5} \tag{8.8} \]
\[ \phantom{h(y,t)} = t \left( 1 - y^5 + y^7 t^{2/5} - y^{11} t^{11/5} \right). \tag{8.9} \]

We see that $\frac{1}{t} h(y,0) = 1 - y^5 = 0$ has one positive real root: y = 1. Now we 
can choose $t = \Delta t > 0$ such that Newton’s method converges quadratically to 
a positive real root starting at y = 1. Let $y^*$: $h(y^*, \Delta t) = 0$, then we find the 
corresponding root in the original coordinates as $x^* = y^* (\Delta t)^{1/5}$. 
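
The inner normals and the inner products used above can be checked with exact rational arithmetic. The following Python sketch assumes, as in this example, that every lifted point is a vertex of the lower hull and that the points are listed by increasing exponent; the function names are chosen here for illustration.

```python
from fractions import Fraction

# Lifted exponents (a, w): monomial x**a carries the power t**w in h(x,t).
lifted = [(0, 1), (5, 0), (7, 0), (11, 1)]


def inner_normals(points):
    """Inner normals (v, 1) of consecutive edges of the lower hull.

    Assumes every point is a vertex of the lower hull and the points
    are sorted by exponent.
    """
    normals = []
    for (a1, w1), (a2, w2) in zip(points, points[1:]):
        # On the edge: v*a1 + w1 = v*a2 + w2.
        normals.append(Fraction(w1 - w2, a2 - a1))
    return normals


def inner_products(v, points):
    """Inner products of (v, 1) with all lifted points."""
    return [v * a + w for a, w in points]


normals = inner_normals(lifted)
for v in normals:
    print(v, inner_products(v, lifted))
```

For each normal the two smallest inner products are equal and are attained at the end points of the corresponding edge, which is exactly the property that selects the two monomials of the start system after the change of coordinates.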

We can even explicitly construct the fractional power series using Newton’s 
method in a computer algebra system like Maple. The following sequence of 
Maple commands achieves this: 


> h := t - x^5 + x^7 - x^(11)*t:
> hy := subs(x = y*t^(1/5),h):
> hyt := simplify(hy/t):
> newton := x -> x - subs(y=x,hyt/diff(hyt,y)):
> x[0] := 1:
> for k from 1 to 6 do
>   x[k] := newton(x[k-1]):
>   s[k] := series(x[k],t=0,15):
>   lprint(op(1,s[k]-s[k-1]));
> end do:


The output of the loop (done in Maple 9) shows the errors between two 
consecutive series expansions: 

1
-301/15625*t^2
-84/3125*t^2
-2112/1953125*t^(18/5)
-32768/152587890625*t^(32/5)
-2147483648/23283064365386962890625*t^(64/5)

We observe the quadratic convergence, typical for Newton’s method. While the 
particular values for the errors shown above may differ on other platforms with 
different versions of Maple, the computed fractional power series expansion is 
“exact”; here we see the series up to third order: 


> series(x[6],t=0,3);

\[ 1 + \frac{1}{5} t^{2/5} + \frac{1}{5} t^{4/5} + \frac{34}{125} t^{6/5} + \frac{266}{625} t^{8/5} + \frac{11284}{15625} t^{2} - \frac{1}{5} t^{11/5} + \frac{100947}{78125} t^{12/5} - \frac{14}{25} t^{13/5} + \frac{12}{5} t^{14/5} + O(t^3) \]


To find the third positive real root, we proceed in a similar fashion, using 
the third inner normal $v_3 = (-1/4, 1)$ in the coordinate change $x = y t^{-1/4}$. 
As it turns out, we can take $\Delta t$ quite large. For $\Delta t = 0.1$, h(x,0.1) = 0 has 
the following three positive (approximate) real roots: 0.73, 1.0, and 1.56. As 
$\Delta t$ grows larger, the real roots collide into multiple roots before escaping to 
the complex plane. 
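
The three roots at $\Delta t = 0.1$ can be reproduced with plain Newton iterations in floating point. The Python sketch below is an illustration only: it starts from the leading terms of the three fractional power series, that is y = 1 in each of the three coordinate changes, and the tolerance and iteration limit are arbitrary assumptions.

```python
def h(x, t):
    """The homotopy h(x,t) = t - x^5 + x^7 - t*x^11."""
    return t - x**5 + x**7 - t * x**11


def dh_dx(x, t):
    """Derivative of h with respect to x."""
    return -5 * x**4 + 7 * x**6 - 11 * t * x**10


def newton(x, t, tol=1e-14, max_it=50):
    """Newton's method in x at fixed t."""
    for _ in range(max_it):
        step = h(x, t) / dh_dx(x, t)
        x -= step
        if abs(step) < tol:
            break
    return x


dt = 0.1
# Leading terms of the three series: x = t^(1/5), x = 1, and x = t^(-1/4).
starts = [dt ** (1 / 5), 1.0, dt ** (-1 / 4)]
roots = [newton(x0, dt) for x0 in starts]
print(roots)
```

Each edge of the lower hull contributes one start point, so the three Newton runs mirror the three homotopies described in the text.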


Exercise 8.2.6. Compute the fractional power series for the third positive 
real root, using Newton’s method as shown above. Make sure enough terms 
in the series expansions are used so that the quadratic convergence is obvious. 


In numerical implementations of polyhedral homotopies, we only use the 
first term of the fractional power series (also known as Puiseux series). The 
connection between these fractional power series and Newton polygons is clas- 
sical for polynomials in two variables, see for example [Lef53] or [Wal62]. The 
generalization to systems of equations can be found in [McD02]. 

Using Newton polytopes to construct real curves and hypersurfaces with a 
prescribed topology is done by Viro’s method [IS03, IV96]. This homotopy to 
glue real roots can be generalized to the case of complete intersections by the 
use of mixed subdivisions, see [Stu94b, Stu94c]. We will define these mixed 
subdivisions in the next section. We apply these so-called polyhedral 
homotopies to solve generic polynomial systems with given fixed Newton polytopes. 


8.2.5 The Cayley trick and Minkowski’s theorem 


Mixed volumes were defined by Minkowski who showed that the volume of a 
linear combination of polytopes is a homogeneous polynomial in the factors 
of the combination. The coefficients of this polynomial are mixed volumes. 
We will visualize this theorem on a simple example by the Cayley trick. 

The Cayley trick [GKZ94, Proposition 1.7, page 274] is a method to rewrite 
a certain resultant as a discriminant of one single polynomial with additional 
variables. The polyhedral version of this trick as in [Stu94a, Lem. 5.2] is due 
to Bernd Sturmfels. See [HRS00] for another application of this trick. 

Consider the following system: 


\[ f = (f_1, f_2): \quad \begin{cases} x_1^3 x_2 + x_1 x_2^2 + 1 = 0 \\ x_1^4 + x_1 x_2 + 1 = 0 \end{cases} \qquad A = (A_1, A_2): \quad \begin{array}{l} A_1 = \{ (3,1), (1,2), (0,0) \} \\ A_2 = \{ (4,0), (1,1), (0,0) \} \end{array} \]


The sparse structure of f is modeled by the tuple $A = (A_1, A_2)$, where $A_1$ 
and $A_2$ are the supports of $f_1$ and $f_2$ respectively. The Newton polytopes are 
the convex hulls of the supports. The Cayley polytope of r polytopes is the 
convex hull of the polytopes placed at the vertices of an (r — 1)-dimensional 
unit simplex. Figure 8.7 illustrates this construction for our example. 


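[Figure 8.7: the Cayley polytope, with the vertices (3,1,0), (1,2,0), (0,0,0) 
of the first polygon at the base and the vertices (4,0,1), (1,1,1), (0,0,1) 
of the second polygon at the top.] 


Fig. 8.7. The Cayley polytope of two polygons. The first polygon is placed at the 
vertex (0,0,0), the second polygon is placed at (0,0,1). 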


For our example, the Cayley polytope is so simple that a triangulation 
is obvious (see Figure 8.8). As every simplex has four vertices, either the 
simplex has three vertices from the same polygon (and the fourth one of the 
other polygon), or the simplex has two vertices of each polygon. A simplex 
of the first type is called unmixed, a simplex of the second type is mixed. 
Imagine taking slices parallel to the base of the Cayley polytope. These slices 
produce scaled copies of the original polygons in the unmixed simplices. In 
the mixed simplex we find one scaled edge from the first and another scaled 
edge from the second polygon, see Figure 8.8. 


[Figure 8.8: the three simplices of the triangulation, drawn separately.] 


Fig. 8.8. A triangulation of the Cayley polytope. The middle simplex is mixed, the 
other two simplices are unmixed. 


On Figure 8.9 we see in the cross section of the Cayley polytope a mixed 
subdivision of the convex combination $\lambda_1 P_1 + \lambda_2 P_2$, $\lambda_1 + \lambda_2 = 1$, $\lambda_1 \geq 0$ and 
$\lambda_2 \geq 0$, where $P_1$ defines the base and $P_2$ is at the top of the polytope. The 
areas of the triangles in the cross section are $\lambda_1^2 \times \mathrm{area}(P_1)$ and $\lambda_2^2 \times \mathrm{area}(P_2)$, 
as each side of the triangle is scaled by $\lambda_1$ and $\lambda_2$ respectively. The area of the 
cell in the subdivision spanned by one edge of $P_1$ (scaled by $\lambda_1$) and the other 
edge of $P_2$ (scaled by $\lambda_2$) is scaled by $\lambda_1 \times \lambda_2$, as we move the cross section. 


[Figure 8.9: cross section of the Cayley polytope above the base with 
vertices (3,1,0), (1,2,0), and (0,0,0).] 


Fig. 8.9. A mixed subdivision induced by a triangulation of the Cayley polytope. 


In Figure 8.10 we show the Minkowski sum of the two polygons $P_1$ and $P_2$,
with their mixed subdivision corresponding to the triangulation of the Cayley
polytope. For this example, Minkowski’s theorem becomes

$$\mathrm{area}(\lambda_1 P_1 + \lambda_2 P_2)
  = V(P_1,P_1)\lambda_1^2 + V(P_1,P_2)\lambda_1\lambda_2 + V(P_2,P_2)\lambda_2^2
  = \tfrac{5}{2}\lambda_1^2 + 8\lambda_1\lambda_2 + 2\lambda_2^2. \tag{8.10}$$

The coefficients in the polynomial (8.10) are mixed volumes (or areas in our
example): $V(P_1,P_1)$ and $V(P_2,P_2)$ are the respective areas of $P_1$ and $P_2$,
while $V(P_1,P_2)$ is the mixed area.
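
The coefficients of (8.10) can be checked with a few lines of code. The sketch below is not part of the original text: it takes the vertices of the example, $P_1$ = conv{(0,0), (3,1), (1,2)} and $P_2$ = conv{(0,0), (4,0), (1,1)}, and assumes the normalization in which the mixed area is area($P_1 + P_2$) - area($P_1$) - area($P_2$). The helper names `convex_hull` and `area` are illustrative.

```python
from fractions import Fraction
from itertools import product

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        chain = []
        for p in seq:
            # Drop points that do not make a strict left turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def area(points):
    """Area of the convex hull of the points (shoelace formula)."""
    hull = convex_hull(points)
    twice = sum(p[0] * q[1] - q[0] * p[1]
                for p, q in zip(hull, hull[1:] + hull[:1]))
    return Fraction(abs(twice), 2)

P1 = [(0, 0), (3, 1), (1, 2)]
P2 = [(0, 0), (4, 0), (1, 1)]
# The Minkowski sum is the convex hull of all sums of vertices.
minkowski_sum = [(a[0] + b[0], a[1] + b[1]) for a, b in product(P1, P2)]

area_P1, area_P2 = area(P1), area(P2)
mixed_area = area(minkowski_sum) - area_P1 - area_P2
print(area_P1, mixed_area, area_P2)   # prints: 5/2 8 2
```

Exact rational arithmetic is used so that the three coefficients come out without rounding.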




Fig. 8.10. A subdivision of the sum of two polygons $P_1$ and $P_2$. The sum is the
convex hull of all sums of the vertices of the polygons. The cells in the subdivision
are labeled by the multipliers for the area of $\lambda_1 P_1 + \lambda_2 P_2$.


The subdivisions we need are induced by a lifting. Such subdivisions are
called regular; they define polyhedral homotopies. For the example, the lifted
supports are $\hat{A} = (\hat{A}_1, \hat{A}_2)$, with

$$\hat{A}_1 = \{(3,1,1), (1,2,0), (0,0,0)\} \quad \text{and} \quad
  \hat{A}_2 = \{(4,0,0), (1,1,1), (0,0,0)\}.$$


Figure 8.11 shows the mixed subdivision of Figure 8.10 as induced by the 
lower hull of the sum of the lifted polytopes. 




Fig. 8.11. A mixed subdivision is regular if it is induced by a lifting. 


As there is only one mixed cell in the mixed subdivision of the Newton
polytopes of our example, there is only one homotopy to consider, for example:

$$h(x,t) = \begin{cases}
  x_1^3 x_2 t + x_1 x_2^2 + 1 = 0 \\
  x_1^4 + x_1 x_2 t + 1 = 0
\end{cases} \tag{8.11}$$

The powers of t in h(x,t) = 0 are the lifting values of the supports which
induced the mixed subdivision shown in Figure 8.11.
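
At t = 0 the monomials carrying a positive power of t drop out of (8.11), leaving the binomial start system $x_1 x_2^2 + 1 = 0$, $x_1^4 + 1 = 0$. A minimal sketch, not from the original text, that lists its solutions in closed form and checks the residuals:

```python
import cmath

# Start system h(x, 0) = 0 of (8.11):  x1*x2**2 + 1 = 0,  x1**4 + 1 = 0.
solutions = []
for k in range(4):
    x1 = cmath.exp(1j * cmath.pi * (2 * k + 1) / 4)   # the four roots of x1**4 = -1
    root = cmath.sqrt(-1 / x1)                        # x2**2 = -1/x1 has two roots
    solutions.extend([(x1, root), (x1, -root)])

def residual(x1, x2):
    """Largest absolute value of the two start equations at (x1, x2)."""
    return max(abs(x1 * x2**2 + 1), abs(x1**4 + 1))

print(len(solutions), max(residual(*s) for s in solutions))
```

The count of eight agrees with the mixed area of the two polygons; each of these solutions is a start point for one path of the polyhedral homotopy.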


320 A.J. Sommese, J. Verschelde, and C.W. Wampler 


Exercise 8.2.7. Verify that the start system h(x, t = 0) = 0 in the polyhedral 
homotopy (8.11) has indeed eight (= V(P1, P2)) regular solutions. Show that 
any system with exactly two monomials in every equation has always as many 
regular roots as its mixed volume, for any nonzero choice of the coefficients. 


8.2.6 Computing mixed volumes and polyhedral continuation 


In the previous subsections we introduced polyhedral continuation and mixed 
volumes. With these two concepts we can state and prove Bernshtein’s first 
theorem. As the way we compute mixed volumes determines the way we solve a 
generic system, this section presents two different methods to compute mixed 
volumes. The first technique relies on the Cayley trick and computes all cells 
in a mixed subdivision. The second method uses linear programming and leads 
to an efficient enumeration of all mixed cells in a mixed subdivision. 

With the Cayley trick we can obtain a regular mixed subdivision as a 
regular triangulation of the Cayley polytope. We next introduce a method to 
compute a regular triangulation of any polytope. Our method will construct 
the triangulation incrementally, adding the points one after the other. The key 
operation is to decompose one point with respect to one simplex. Consider
for example the simplex $[c_0, c_1, c_2]$ spanned by $c_0 = (0,0)$, $c_1 = (3,2)$, and
$c_2 = (2,4)$. If we take one extra point, three possible updates can occur,
illustrated by Table 8.2.


point        barycentric decomposition                                            pivoting
x = (2,3):   $x = +\tfrac{1}{8} c_0 + \tfrac{1}{4} c_1 + \tfrac{5}{8} c_2$        no new simplex
y = (5,1):   $y = -\tfrac{3}{8} c_0 + \tfrac{9}{4} c_1 - \tfrac{7}{8} c_2$        $[y, c_1, c_2]$, $[c_0, c_1, y]$
z = (1,5):   $z = +\tfrac{1}{8} c_0 - \tfrac{3}{4} c_1 + \tfrac{13}{8} c_2$       $[c_0, z, c_2]$


Table 8.2. Three possible updates of the simplex $[c_0, c_1, c_2]$ with one point, x, y,
or z. Either we have no, two, or one new simplex by interchanging the vertex with
negative coefficient with the point.


Solving a linear system we can write any point as a linear combination of 
the vertices of a simplex, requiring the coefficients in that linear combination 
to sum up to one. We call this linear combination a barycentric decomposi- 
tion of a point with respect to a simplex. The negative signs of the coefficients 
in this barycentric decomposition tell which vertices of the simplex to inter- 
change with the new point to create new simplices in the triangulation of the 
convex hull of the original simplex and the point. As we can see from
Figure 8.12, any triangulation obtained by placing points (see [Lee91] for more
on triangulations) in this way is regular.


Fig. 8.12. Pivoting to obtain a regular triangulation of a polygon. The construction
on the right shows how the triangulation can be obtained as the lower hull of y and
z lifted at height one, with $[c_0, c_1, c_2]$ sitting at level zero.
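
The pivoting rule can be sketched in a few lines. The code below is an illustration, not the authors' implementation: it solves the linear system for the barycentric coefficients with Cramer's rule in exact arithmetic, using the simplex and the three points of Table 8.2. The function names `barycentric` and `pivot` are hypothetical.

```python
from fractions import Fraction

def barycentric(simplex, p):
    """Coefficients (l0, l1, l2), summing to one, with p = l0*c0 + l1*c1 + l2*c2."""
    (x0, y0), (x1, y1), (x2, y2) = simplex
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    l1 = Fraction((p[0] - x0) * (y2 - y0) - (x2 - x0) * (p[1] - y0), det)
    l2 = Fraction((x1 - x0) * (p[1] - y0) - (p[0] - x0) * (y1 - y0), det)
    return (1 - l1 - l2, l1, l2)

def pivot(simplex, p):
    """New simplices: replace each vertex that has a negative coefficient by p."""
    coeffs = barycentric(simplex, p)
    return [tuple(p if j == i else v for j, v in enumerate(simplex))
            for i, l in enumerate(coeffs) if l < 0]

simplex = ((0, 0), (3, 2), (2, 4))
for point in [(2, 3), (5, 1), (1, 5)]:
    print(point, barycentric(simplex, point), pivot(simplex, point))
```

For the three points this gives no, two, and one new simplex respectively.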

The algorithm to compute regular triangulations incrementally leads to 
an incremental polyhedral solver, which solves polynomial systems adding one 
monomial after the other, see [VGC96]. If the structure of a polynomial system 
is such that most polynomials share the same support (or more generally span 
the same Newton polytope), and thus there are only few distinct Newton 
polytopes to consider, then the Cayley trick is not too wasteful. 

The complexity of computing volumes and mixed volumes is discussed
respectively in [DF88] and [DGH98].


Theorem 8.2.8. (Bernshtein’s theorem A) The number of roots of a
generic system equals the mixed volume of its Newton polytopes.


In his proof of this theorem, Bernshtein [Ber75] used a homotopy (imple- 
mented in [VVC94]), based on a recursive formula for computing mixed vol- 
umes. This proof idea was generalized by Huber and Sturmfels in [HS95]. 
Note that the theorem concerns “generic systems”, which are systems with
randomly chosen coefficients. These generic systems serve as start systems in a
coefficient-parameter homotopy to solve any specific polynomial system with
the same Newton polytopes.

For the coordinate changes in the polyhedral homotopies, we need to know 
the inner normals to the mixed cells. Therefore, we use a dual representation 
of polytopes, see Figure 8.13. The normal fan of a polytope is the collection 
of the normal cones to all faces of the polytope. The normal cone to a face 
contains all inner normals which define the face. 


Fig. 8.13. Two polygons $P_1$ and $P_2$ and their normal fans, $N(P_1)$ and $N(P_2)$. The
labels corresponding to the edges in the fans are inner normals to the corresponding
edges of the polygons.


We are only interested in the mixed cells of a mixed subdivision, and in
particular, the inner normal to those lower facets of the Minkowski sum which
define the mixed cells. Figure 8.14 illustrates that the inner normal to a mixed
cell lies in the intersection of the normal cones to the edges which span that
mixed cell.


Fig. 8.14. The dual representation of a mixed subdivision. 


The search for all inner normals to the mixed cells in a mixed subdivision
naturally leads to a system of linear equalities and inequalities. For a tuple of
n supports $(A_1, A_2, \ldots, A_n)$, consider an edge of the kth polytope, spanned
by $\{a, b\} \subset A_k$. Then the inner normal v to this edge satisfies

$$\langle a, v \rangle = \langle b, v \rangle \leq \langle c, v \rangle,
  \quad \text{for all } c \in A_k. \tag{8.12}$$

Enumerating all edges of a polytope is thus equivalent to enumerating all
feasible solutions to the system (8.12). Letting k range from 1 to n in (8.12)
applied to the lifted point sets $\hat{A}_k$ provides the dual linear-programming model
to enumerate all inner normals to the mixed cells in a regular mixed subdivision.
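
For the small example of this section the model can be solved by brute force. The sketch below is an illustration under stated assumptions, not the lift-and-prune method: it takes the lifted supports given earlier, normalizes the inner normal as v = (v1, v2, 1), and for every pair of edges solves the two equalities by Cramer's rule before testing the inequalities. The function name `mixed_cells` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def dot(a, v):
    return sum(ai * vi for ai, vi in zip(a, v))

def mixed_cells(lifted1, lifted2):
    """Pairs of edges, one per lifted support, sharing an inner normal (v1, v2, 1)."""
    cells = []
    for a, b in combinations(lifted1, 2):
        for c, d in combinations(lifted2, 2):
            p = [a[i] - b[i] for i in range(3)]
            q = [c[i] - d[i] for i in range(3)]
            det = p[0] * q[1] - p[1] * q[0]
            if det == 0:
                continue  # parallel edges span no two-dimensional cell
            # Solve <p, v> = 0 and <q, v> = 0 for (v1, v2) by Cramer's rule.
            v1 = Fraction(-p[2] * q[1] + p[1] * q[2], det)
            v2 = Fraction(-p[0] * q[2] + p[2] * q[0], det)
            v = (v1, v2, 1)
            # Inequalities of (8.12): the edge minimizes the inner product with v.
            ok1 = all(dot(a, v) <= dot(e, v) for e in lifted1)
            ok2 = all(dot(c, v) <= dot(e, v) for e in lifted2)
            if ok1 and ok2:
                cells.append(((a, b), (c, d), v, abs(det)))
    return cells

A1 = [(3, 1, 1), (1, 2, 0), (0, 0, 0)]
A2 = [(4, 0, 0), (1, 1, 1), (0, 0, 0)]
cells = mixed_cells(A1, A2)
for edge1, edge2, normal, volume in cells:
    print(edge1, edge2, normal, volume)
```

The last entry of each cell is the absolute value of the determinant of the two edge vectors, that is, the area of the cell; summing these areas over all mixed cells gives the mixed volume. Enumerating all pairs of edges is only feasible for tiny examples, which is why the linear-programming methods cited in this section matter in practice.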

A lift-and-prune strategy to enumerate all mixed cells in a regular mixed
subdivision was proposed in [EC95] and dualized in [VGC96]. Recently, insight
in the linear programming methods has led to very efficient calculations of
mixed volumes, as developed in [DKK03], [GL00, GL03], [KK03b], [LL01],
and [TKF02].


8.2.7 Bernshtein’s second theorem


When tracing solution paths diverging to infinity, one may wonder when to 
stop. After all, infinity is pretty far off, and even if good knowledge of the 
application domain gives us good bounds on the size of the solutions, we do 
not want to miss valid solutions with large components. If a path seems to 
diverge, we must know whether we have true divergence or convergence to a 
root with large components. Bernshtein’s second theorem [Ber75] will provide
us with a certificate of divergence. 

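For a system f(x) = 0, supported by $A = (A_1, A_2, \ldots, A_n)$, we can write
its equations $f = (f_1, f_2, \ldots, f_n)$ as

$$f_i(x) = \sum_{a \in A_i} c_{ia} x^a, \quad i = 1, 2, \ldots, n.$$

The Newton polytopes of f are denoted by $P = (P_1, P_2, \ldots, P_n)$, with
$P_i := \mathrm{conv}(A_i)$, $i = 1, 2, \ldots, n$. Then for any $w \neq 0$, we define the
tuple of faces $\partial_w P = (\partial_w P_1, \partial_w P_2, \ldots, \partial_w P_n)$, as
$\partial_w P_i = \mathrm{conv}(\partial_w A_i)$, with

$$\partial_w A_i = \{\, a \in A_i \mid \langle a, w \rangle = \min_{a' \in A_i} \langle a', w \rangle \,\}. \tag{8.13}$$

The set $\partial_w A_i$ is the support of the face of the ith polynomial $f_i$:

$$\partial_w f_i(x) = \sum_{a \in \partial_w A_i} c_{ia} x^a.$$

We write $\partial_w f = (\partial_w f_1, \partial_w f_2, \ldots, \partial_w f_n)$ as the face of the
system f determined by $w \neq 0$. The mixed volume of P is denoted by V(P) and
$\mathbb{C}^* = \mathbb{C} \setminus \{0\}$.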


Theorem 8.2.9. (Bernshtein’s theorem B) If $\forall w \neq 0$, $\partial_w f(x) = 0$ has
no solutions in $(\mathbb{C}^*)^n$, then V(P) is exact and all solutions are isolated.
Otherwise, for $V(P) \neq 0$: V(P) > #isolated solutions.


Interestingly, the Newton polytopes may often be in general position, i.e.:
V(P) is exact for every nonzero choice of the coefficients. Consider for example
the following system:

$$f(x) = \begin{cases}
  c_{111} x_1 x_2 + c_{110} x_1 + c_{101} x_2 + c_{100} = 0 \\
  c_{222} x_1^2 x_2^2 + c_{210} x_1 + c_{201} x_2 = 0
\end{cases}$$


We show the tuple of Newton polytopes in Figure 8.15. 


Exercise 8.2.10. Verify that the mixed volume $V(P_1, P_2)$ of the polygons $P_1$
and $P_2$ is indeed equal to four.


While the observation in Figure 8.15 would let us believe that the mixed
volume always provides a sharp root count, we have to keep in mind that
the vertices of the polytopes are not randomly chosen. The vertices occur as
the exponents in the polynomials. For instance, general Newton polytopes are
almost never simplicial, we usually find k-dimensional faces spanned by far
more than k + 1 vertices.


Fig. 8.15. Two Newton polygons in general position: $\forall w \neq 0$:
$\#\partial_w A_1 + \#\partial_w A_2 \leq 3 \Rightarrow V(P_1, P_2) = 4$ is always exact, for all
nonzero choices of the coefficients of f, because we need at least four monomials
for $\partial_w f(x) = 0$ to have all its roots in $(\mathbb{C}^*)^2$.
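
The monomial count behind Figure 8.15 can be reproduced directly from definition (8.13). The sketch below is an illustration, not part of the original text: it uses the supports read off from the example system, $A_1$ = {(1,1), (1,0), (0,1), (0,0)} and $A_2$ = {(2,2), (1,0), (0,1)}, and scans a grid of integer directions w, which is an assumption that suffices here because the grid contains a normal to every edge of both polygons.

```python
def face(support, w):
    """Points of the support where the inner product with w is minimal, see (8.13)."""
    values = [sum(ai * wi for ai, wi in zip(a, w)) for a in support]
    return [a for a, val in zip(support, values) if val == min(values)]

A1 = [(1, 1), (1, 0), (0, 1), (0, 0)]
A2 = [(2, 2), (1, 0), (0, 1)]

# All nonzero integer directions in a small grid.
directions = [(u, v) for u in range(-5, 6) for v in range(-5, 6) if (u, v) != (0, 0)]
largest = max(len(face(A1, w)) + len(face(A2, w)) for w in directions)
print(largest)   # 3 for these supports
```

No direction selects an edge of both polygons at once, so every face system has at most three monomials and cannot have solutions with all coordinates nonzero.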

Following Bernshtein we look at what happens when we consider the solu- 
tion paths in a homotopy going from a generic to a specific polynomial system. 
At the limit of the paths, we look at the power series expansion, using the 
following result. 


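Theorem 8.2.11. $\forall x(t)$, $h(x(t), t) = (1 - t) g(x(t)) + t f(x(t)) = 0$,
$\exists s > 0$, $m \in \mathbb{N} \setminus \{0\}$, $w \in \mathbb{Z}^n$:

$$\begin{cases}
  x_i(s) = b_i s^{w_i} (1 + O(s)), & i = 1, 2, \ldots, n \\
  t(s) = 1 - s^m &
\end{cases} \quad \text{for } t \approx 1,\ s \approx 0.$$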

The number m is called the winding number of the solution at the end of
the path (not to be confused with the multiplicity). The winding number is
the smallest number so that $z(2\pi m) = z(0)$, if we consider $z(\theta)$ a solution
path of $h(z(\theta), t(\theta)) = 0$, winding around 1 with values for the continuation
parameter t defined by $t = 1 + (t_0 - 1) e^{i\theta}$, as $t_0 \approx 1$.

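At the end of a path, when does $\lim_{t \to 1} x_i(t) \in \mathbb{C}^*$? From Theorem 8.2.11, we
can characterize the divergence of the path x(t) by the leading exponents w
in the power series:

$$x_i(t) \begin{cases}
  \to \infty & \Leftrightarrow\ w_i < 0 \\
  \in \mathbb{C}^* & \Leftrightarrow\ w_i = 0 \\
  \to 0 & \Leftrightarrow\ w_i > 0
\end{cases}$$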

From this simple observation we see that a solution at infinity and a solution 
with zero components are regarded (or disregarded) equally. 

Next we show the relation between face systems and power series. Assuming
$\lim_{t \to 1} x_i(t) \notin \mathbb{C}^*$, and $w_i \neq 0$, we consider a diverging path.

First we substitute the power series $x_i(s) = b_i s^{w_i}(1 + O(s))$, $i = 1, 2, \ldots, n$,
$t(s) = 1 - s^m$, $s \approx 0$ into the homotopy $h(x,t) = (1 - t) g(x) + t f(x) = 0$. We
find

$$h(x(s), t(s)) = s^m g(x(s)) + (1 - s^m) \underbrace{f(x(s))}_{\text{dominant as } s \to 0} = 0.$$

Thus (as expected), the choice of the start system g(x) = 0 plays no role in
what happens as s approaches zero. Let us now see what the substitution does
to the ith polynomial:

$$f_i(x) = \sum_{a \in A_i} c_{ia} x^a \ \longrightarrow\
  f_i(x(s)) = \sum_{a \in A_i} c_{ia} \prod_{j=1}^{n} b_j^{a_j} s^{\langle a, w \rangle} (1 + O(s)),$$

in which the terms of $\partial_w f_i(x(s))$ are dominant.


Arranging the monomials in f(x(s)) in increasing order of powers of s, we see
that the monomials that become dominant as $s \to 0$ have exponents whose
inner product is minimal with w. Recall that we characterize these exponents
by the face of the support $A_i$ in the direction of w, see (8.13). Moreover, as
$f_i(x(s)) = 0$ for $s > 0$, we see from the result of the substitution that then
$\partial_w f_i(b) = 0$, and thus $\partial_w f(b) = 0$ for some $b \in (\mathbb{C}^*)^n$.

This is the key idea in the proof of Bernshtein’s second theorem. Like his 
first theorem, his idea is very constructive: follow the direction of a diverging 
path and (in addition to a solution at infinity) we find a face system which has 
solutions in (C*)”. This face system forms a certificate for the mixed volume 
to overshoot the actual number of roots. 

That Richardson extrapolation is useful to find w is not so surprising. A 
closer inspection of the errors of the error expansion reveals that a similar 
extrapolation scheme can be applied to approximate the winding number m. 

As we get closer to our target system, we have to decrease our step size
when dealing with a difficult path. For the purpose of extrapolation, it is better
to decrease the step size geometrically, i.e., for some $\lambda$, $0 < \lambda < 1$, consecutive
values $t_0, t_1, \ldots, t_k$ of the continuation parameter t satisfy
$1 - t_k = \lambda (1 - t_{k-1}) = \cdots = \lambda^k (1 - t_0)$ and for the corresponding
sequence of s-values we have $s_k = \lambda^{1/m} s_{k-1} = \cdots = \lambda^{k/m} s_0$.

Recall the form of the power series for a solution path x(s) for s approaching
zero: $x_i(s) = b_i s^{w_i}(1 + O(s))$ with $t(s) = 1 - s^m$. Sampled along
$s_0, s_1, \ldots, s_k$, we obtain

$$x_i(s_k) = b_i \lambda^{k w_i / m} s_0^{w_i} (1 + O(\lambda^{k/m} s_0)). \tag{8.14}$$


Since we are interested in the leading powers $w_i$, we take the logarithms of
the magnitudes of the points sampled along the path:

$$\log|x_i(s_k)| = \log|b_i| + \frac{k w_i}{m} \log(\lambda) + w_i \log(s_0)
  + \log\Bigl| 1 + \sum_{j=0}^{\infty} b'_j (\lambda^{k/m} s_0)^j \Bigr|.$$

A first-order approximation for $w_i$ is given by $v_{kk+1}$ with the general
extrapolation formula in $v_{k..l}$:

$$v_{kk+1} := \log|x_i(s_{k+1})| - \log|x_i(s_k)|, \qquad
  v_{k..l} = v_{k..l-1} + \frac{v_{k+1..l} - v_{k..l-1}}{1 - \lambda},$$

which results in $w_i = m\, v_{0..r} / \log(\lambda) + O(s_0^r)$. We can make the order r of
the extrapolation as high as we like (thereby increasing the accuracy of $w_i$),
but notice that the formula assumes we know the winding number m.

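If we examine the expansion of the errors:

$$\begin{aligned}
  e_i^{(k)} &= (\log|x_i(s_k)| - \log|x_i(s_{k+1})|) && (8.15) \\
            &\quad - (\log|x_i(s_{k+1})| - \log|x_i(s_{k+2})|) && (8.16) \\
            &= c_1 \lambda^{k/m} s_0 (1 + O(\lambda^{k/m})), && (8.17)
\end{aligned}$$

we find similar extrapolation formulas to approximate m:

$$e_i^{(kk+1)} := \log(e_i^{(k+1)}) - \log(e_i^{(k)}), \qquad
  e_i^{(k..l)} := e_i^{(k+1..l)} + \frac{e_i^{(k+1..l)} - e_i^{(k..l-1)}}{1 - \lambda_{k..l}},$$

with $\lambda_{k..l} = \lambda^{(l-k-1)/m_{k..l}}$. So we obtain
$m_{k..l} = \log(\lambda) / e_i^{(k..l)} + O(\lambda^{(l-k)k/m})$.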
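
The first-order versions of both estimates are easy to try on a synthetic path. The sketch below is an illustration only: the path $x(s) = b\, s^{w}(1 + c\, s)$, the values of b, c, m, w, and the sampling parameters are all assumptions chosen for the demonstration, and no higher-order extrapolation is applied. The estimator only sees samples at geometrically spaced values of 1 - t.

```python
import math

m_true, w_true = 2, -1       # assumed winding number and leading exponent
b, c = 3.0, 0.7              # assumed coefficients of the synthetic path
lam, gap0 = 0.5, 0.01        # geometric ratio and the initial value of 1 - t

def x_at(gap):
    """Value of the synthetic path where 1 - t = gap, using s = gap**(1/m)."""
    s = gap ** (1 / m_true)
    return b * s**w_true * (1 + c * s)

def log_abs(k):
    """log |x| at the k-th sample, where 1 - t_k = lam**k * (1 - t_0)."""
    return math.log(abs(x_at(lam**k * gap0)))

def error(k):
    """Second difference of the sampled logarithms, as in (8.15)-(8.16)."""
    return (log_abs(k) - log_abs(k + 1)) - (log_abs(k + 1) - log_abs(k + 2))

k = 10
# The errors shrink by a factor lam**(1/m) per step, which reveals m.
m_est = math.log(lam) / (math.log(abs(error(k + 1))) - math.log(abs(error(k))))
m = round(m_est)
# With m known, the first difference of the logarithms gives w.
w_est = m * (log_abs(k + 1) - log_abs(k)) / math.log(lam)
print(m_est, w_est)
```

Both numbers come out close to the assumed values 2 and -1; the remaining error is of the order of the sampled value of s, which is what the extrapolation formulas are designed to reduce.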


The system of Cassou-Noguès is a very nice example. It illustrates how
symbolic results can be obtained by purely numerical means.

$$f(b,c,d,e) = \begin{cases}
  15b^4cd^2 + 6b^4c^3 + 21b^4c^2d - 144b^2c - 8b^2c^2e \\
  \quad - 28b^2cde - 648b^2d + 36b^2d^2e + 9b^4d^3 - 120 = 0 \\
  30c^3b^4d - 32de^2c - 720db^2c - 24c^3b^2e - 432c^2b^2 + 576ec \\
  \quad - 576de + 16cb^2d^2e + 16d^2e^2 + 16e^2c^2 + 9c^4b^4 + 5184 \\
  \quad + 39d^2b^4c^2 + 18d^3b^4c - 432d^2b^2 + 24d^3b^2e - 16c^2b^2de - 240c = 0 \\
  216db^2c - 162d^2b^2 - 81c^2b^2 + 5184 + 1008ec - 1008de \\
  \quad + 15c^2b^2de - 15c^3b^2e - 80de^2c + 40d^2e^2 + 40e^2c^2 = 0 \\
  261 + 4db^2c - 3d^2b^2 - 4c^2b^2 + 22ec - 22de = 0
\end{cases}$$

Root counts: D = 1344, B = 312, V(P) = 24, but there are only 16 finite
roots.

$$\partial_{(0,0,0,-1)} f(b,c,d,e) = \begin{cases}
  -8b^2c^2e - 28b^2cde + 36b^2d^2e = 0 \\
  -32de^2c + 16d^2e^2 + 16e^2c^2 = 0 \\
  -80de^2c + 40d^2e^2 + 40e^2c^2 = 0 \\
  22ec - 22de = 0
\end{cases}$$

The winding number is m = 2. See [HV98] for more about polyhedral end
games.
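
The face system in the direction w = (0,0,0,-1) can be recomputed mechanically from definition (8.13). The sketch below is an illustration, not part of the original text; to stay short it encodes only the first and the fourth equation of the system as dictionaries from exponent vectors in (b, c, d, e) to coefficients, and the helper names are hypothetical.

```python
# Variables ordered (b, c, d, e); a polynomial is a dict {exponent tuple: coefficient}.
f1 = {(4, 1, 2, 0): 15, (4, 3, 0, 0): 6, (4, 2, 1, 0): 21, (2, 1, 0, 0): -144,
      (2, 2, 0, 1): -8, (2, 1, 1, 1): -28, (2, 0, 1, 0): -648, (2, 0, 2, 1): 36,
      (4, 0, 3, 0): 9, (0, 0, 0, 0): -120}
f4 = {(0, 0, 0, 0): 261, (2, 1, 1, 0): 4, (2, 0, 2, 0): -3, (2, 2, 0, 0): -4,
      (0, 1, 0, 1): 22, (0, 0, 1, 1): -22}

def face(poly, w):
    """Terms whose exponent has minimal inner product with w."""
    def weight(a):
        return sum(ai * wi for ai, wi in zip(a, w))
    lowest = min(weight(a) for a in poly)
    return {a: coeff for a, coeff in poly.items() if weight(a) == lowest}

def evaluate(poly, point):
    """Value of the polynomial at a point with integer coordinates."""
    total = 0
    for a, coeff in poly.items():
        term = coeff
        for xi, ai in zip(point, a):
            term *= xi**ai
        total += term
    return total

w = (0, 0, 0, -1)
face1, face4 = face(f1, w), face(f4, w)
point = (2, 3, 3, 5)   # an arbitrary point with c = d and nonzero coordinates
print(face1, face4, evaluate(face1, point), evaluate(face4, point))
```

Both face polynomials vanish at the chosen point, and the same holds for any point with c = d, so the face system has solutions with all coordinates nonzero. By Bernshtein's second theorem this is consistent with the mixed volume 24 exceeding the 16 finite roots.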


8.3 Homotopies for positive dimensional solution sets


To introduce the numerical representation of positive dimensional solution
sets, we start off with a dictionary, linking concepts in algebraic geometry
to data and algorithms in numerical analysis. Witness sets form the central
data and are obtained by a cascade of homotopies. The companion algorithms
to the witness sets are membership tests to decide whether any given point
belongs to a certain component of the solution set. We illustrate a numerical
irreducible decomposition on a simple example and give an overview of our
numerical factorization methods.


8.3.1 A dictionary 


Kempf writes in [Kem93] that “Algebraic geometry studies the delicate bal- 
ance between the geometrically plausible and the algebraically possible”. With 
our numerical tools, we feel closer to the geometrical than to the algebraic 
side, because we are not calculating with polynomials in the algebraic sense. 
In [SVW03] we outlined the structure of a dictionary, presented as Table 8.3. 


Numerical Algebraic Geometry Dictionary

| Algebraic Geometry | Example in 3-space | Numerical Analysis |
|---|---|---|
| variety | collection of points, algebraic curves, and algebraic surfaces | polynomial system + union of witness sets, see below for the definition of a witness point |
| irreducible variety | a single point, or a single curve, or a single surface | polynomial system + witness set + probability-one membership test |
| generic point on an irreducible variety | random point on an algebraic curve or surface | point in a witness set; a witness point is a solution of the polynomial system on the variety and on a random slice whose codimension is the dimension of the variety |
| pure dimensional variety | one or more points, or one or more curves, or one or more surfaces | polynomial system + set of witness sets of same dimension + probability-one membership tests |
| irreducible decomposition of a variety | several pieces of different dimensions | polynomial system + array of sets of witness sets and probability-one membership tests |


Table 8.3. Dictionary to translate algebraic geometry into numerical analysis. 


8.3.2 Witness sets and a cascade of homotopies 


A witness set is the basic concept of numerical algebraic geometry as it allows 
us to apply numerical methods for isolated solutions to positive dimensional 
solution components. 

Every irreducible component of a solution set is represented by a witness set
whose cardinality equals the degree of the irreducible component. To reduce
a solution set of dimension k to a set of isolated points, we cut the k degrees
of freedom by adding k random hyperplanes L(x) = 0 to the system f(x) = 0
which defines the entire solution set.
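
As a small illustration of this definition, consider the twisted cubic, parametrized by $(t, t^2, t^3)$: it has dimension one and degree three, so one hyperplane cuts out three witness points. The sketch below is not from the original text. The hyperplane coefficients are fixed stand-ins for random complex numbers, and the parametrization replaces homotopy continuation, which works here only because the curve is known in advance; the root finder is a plain Durand-Kerner iteration.

```python
def durand_kerner(coeffs, iterations=200):
    """All complex roots of sum(coeffs[k] * t**k); coeffs[-1] must be nonzero."""
    n = len(coeffs) - 1
    monic = [ck / coeffs[-1] for ck in coeffs]
    roots = [(0.4 + 0.9j) ** k for k in range(n)]     # customary starting guesses
    for _ in range(iterations):
        for i in range(n):
            value = sum(ck * roots[i] ** k for k, ck in enumerate(monic))
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots

# Hyperplane L(x) = c0 + c1*x1 + c2*x2 + c3*x3 (stand-ins for random numbers).
c = [0.3 + 0.7j, -1.1 + 0.2j, 0.5 - 0.9j, 0.8 + 0.4j]

# On the twisted cubic (t, t**2, t**3), L restricts to a cubic polynomial in t.
witness_set = [(t, t**2, t**3) for t in durand_kerner(c)]

def hyperplane(x):
    return c[0] + c[1] * x[0] + c[2] * x[1] + c[3] * x[2]

print(len(witness_set), max(abs(hyperplane(x)) for x in witness_set))
```

The number of witness points equals the degree of the curve, as stated above.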

One obstacle is that we have to deal with systems whose number of equations
is not necessarily the same as the number of unknowns. If there are
fewer equations than unknowns, we simply add enough random hyperplanes
to make up for the difference, so underdetermined systems are easy to handle.

Let us consider overdetermined systems, say f consists of 5 equations in 3 
variables. To turn f into a system of N equations in N variables where N is 
either 3 or 5, we can respectively apply the following techniques: 


randomization: Choosing random complex numbers $a_{ij}$, we add random
combinations of the last two polynomials to the first three polynomials:

$$\begin{cases}
  f_1(x) + a_{11} f_4(x) + a_{12} f_5(x) = 0 \\
  f_2(x) + a_{21} f_4(x) + a_{22} f_5(x) = 0 \\
  f_3(x) + a_{31} f_4(x) + a_{32} f_5(x) = 0
\end{cases}$$

slack variables: We introduce two new variables $z_1$ and $z_2$ (so-called slack
variables) and add random multiples of these variables to every equation:

$$\begin{cases}
  f_1(x) + a_{11} z_1 + a_{12} z_2 = 0 \\
  f_2(x) + a_{21} z_1 + a_{22} z_2 = 0 \\
  f_3(x) + a_{31} z_1 + a_{32} z_2 = 0 \\
  f_4(x) + a_{41} z_1 + a_{42} z_2 = 0 \\
  f_5(x) + a_{51} z_1 + a_{52} z_2 = 0
\end{cases}$$

While the randomization technique might seem at first more attractive be- 
cause we are left with fewer equations, working with slack variables provides 
a cascade of homotopies to compute candidate witness points on all positive 
dimensional components. 

In particular, considering $f_4$ and $f_5$ as hyperplanes $L_1$ and $L_2$ to cut the
solution set of the first three equations in f, we consider a cascade of three
systems. To get witness points on the two dimensional solution sets, we first
solve

$$\begin{cases}
  f_1(x) + a_{11} z_1 + a_{12} z_2 = 0 \\
  f_2(x) + a_{21} z_1 + a_{22} z_2 = 0 \\
  f_3(x) + a_{31} z_1 + a_{32} z_2 = 0 \\
  L_1(x) + z_1 = 0 \\
  L_2(x) + z_2 = 0
\end{cases}$$


Solutions with $z_1 = 0$ and $z_2 = 0$ define witness points on the two dimensional
solution components. Solutions with $z_1 \neq 0$ and $z_2 \neq 0$ provide start points
in the homotopy which removes $L_2$ from the system, which leads to the next
system in the cascade:

$$\begin{cases}
  f_1(x) + a_{11} z_1 + a_{12} z_2 = 0 \\
  f_2(x) + a_{21} z_1 + a_{22} z_2 = 0 \\
  f_3(x) + a_{31} z_1 + a_{32} z_2 = 0 \\
  L_1(x) + z_1 = 0 \\
  z_2 = 0
\end{cases}$$

The paths defined by this move end at witness points on the one dimensional
components, picked out by $z_1 = 0$. Solutions with $z_1 \neq 0$ are used in the
homotopy which removes $L_1$ to lead to the isolated solutions of the system.
The last system in the cascade is

$$\begin{cases}
  f_1(x) + a_{11} z_1 = 0 \\
  f_2(x) + a_{21} z_1 = 0 \\
  f_3(x) + a_{31} z_1 = 0 \\
  z_1 = 0 \\
  z_2 = 0
\end{cases}$$

In the next section we give a specific example of this cascade. 

The idea of slicing a solution set by hyperplanes to determine its dimension 
appeared in [GH93] to prove that the theoretical complexity of this problem 
is polynomial. 


Exercise 8.3.1. Consider the adjacent minors of a general 2 x 4-matrix:

$$\begin{bmatrix}
  x_{11} & x_{12} & x_{13} & x_{14} \\
  x_{21} & x_{22} & x_{23} & x_{24}
\end{bmatrix}, \qquad
f(x) = \begin{cases}
  x_{11} x_{22} - x_{21} x_{12} = 0 \\
  x_{12} x_{23} - x_{22} x_{13} = 0 \\
  x_{13} x_{24} - x_{23} x_{14} = 0
\end{cases}$$

Verify that $\dim(f^{-1}(0)) = 5$ and $\deg(f^{-1}(0)) = 8$. This is the simplest
instance of a general family of problems introduced in [DES98], see [HS00] for
special decomposition methods.


8.3.3 A probability-one membership test 


A probability-one membership test determines whether a given point p lies
on a pure dimensional solution set. Suppose we have witness points defined
by a polynomial system f(x) = 0 and hyperplanes L(x) = 0. A homotopy
method implements the probability-one membership test:


1. Define K(x) = L(x) - L(p). As K(p) = 0, the hyperplanes K pass
through p.
2. Consider the homotopy

$$h(x,t) = \begin{pmatrix} f(x) \\ K(x) \end{pmatrix} (1 - t)
         + \begin{pmatrix} f(x) \\ L(x) \end{pmatrix} t = 0.$$

At t = 1 we start tracking paths at the witness set and find their end
points at t = 0.
3. If p belongs to the solution set of h(x,0) = 0, then it is also a witness
point of the pure dimensional solution set.


Notice that this test does not move the point p, which may be a highly singular 
point. This observation is important for the numerical stability of this test. 
The test is illustrated in Figure 8.16. 
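
The three steps can be imitated on a toy curve. The sketch below is an illustration under stated assumptions, not the authors' implementation: the curve is the parabola $y - x^2 = 0$, the line coefficients are fixed stand-ins for random complex numbers, and the end points of the paths from L to K are obtained by solving a quadratic directly instead of tracking them.

```python
import cmath

# Curve f(x, y) = y - x**2 = 0, sliced by the line L(x, y) = y - a*x - b = 0.
a, b = 0.6 + 0.3j, -0.4 + 1.1j     # stand-ins for random complex numbers

def slice_points(a, rhs):
    """Points of the curve on the line y - a*x = rhs: roots of x**2 - a*x - rhs."""
    disc = cmath.sqrt(a * a + 4 * rhs)
    return [(x, x * x) for x in ((a + disc) / 2, (a - disc) / 2)]

witness_set = slice_points(a, b)   # witness points on L: the start points at t = 1

def is_member(p, tol=1e-9):
    # K(x, y) = L(x, y) - L(p) vanishes at p; its slice is y - a*x = p_y - a*p_x.
    # A direct solve stands in for tracking the paths from L to K.
    end_points = slice_points(a, p[1] - a * p[0])
    return any(abs(q[0] - p[0]) + abs(q[1] - p[1]) < tol for q in end_points)

print(is_member((2, 4)), is_member((2, 5)))   # prints: True False
```

The test point itself is never moved; only the slice is, which is the property stressed in the text.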


Fig. 8.16. Illustration of a probability-one membership test using a homotopy. The
homotopy moves the line L of the witness set for $f^{-1}(0)$ to the line K, which passes
through the test point p. As none of the witness points on K equals p, $p \notin f^{-1}(0)$.


8.3.4 A numerical irreducible decomposition 


Consider the following example:

$$f(x) = \begin{cases}
  (x_1 - 1)(x_2 - x_1^2) = 0 \\
  (x_1 - 1)(x_3 - x_1^3) = 0 \\
  (x_1^2 - 1)(x_2 - x_1^2) = 0
\end{cases}$$

From its factored form we see that f(x) = 0 has two solution components: the
two dimensional plane $x_1 = 1$ and the twisted cubic
$\{ (x_1, x_2, x_3) \mid x_2 - x_1^2 = 0,\ x_3 - x_1^3 = 0 \}$.

To describe the solution set of this system, we use a cascade of homotopies;
the chart in Figure 8.17 illustrates the flow of data for this example.

Because the top dimensional component is of dimension two, we add two
random hyperplanes to the system and make it square again by adding two
slack variables $z_1$ and $z_2$:

$$e(x, z_1, z_2) = \begin{cases}
  (x_1 - 1)(x_2 - x_1^2) + a_{11} z_1 + a_{12} z_2 = 0 \\
  (x_1 - 1)(x_3 - x_1^3) + a_{21} z_1 + a_{22} z_2 = 0 \\
  (x_1^2 - 1)(x_2 - x_1^2) + a_{31} z_1 + a_{32} z_2 = 0 \\
  c_{10} + c_{11} x_1 + c_{12} x_2 + c_{13} x_3 + z_1 = 0 \\
  c_{20} + c_{21} x_1 + c_{22} x_2 + c_{23} x_3 + z_2 = 0
\end{cases}$$

where all constants $a_{ij}$, i = 1,2,3, j = 1,2, and $c_{kl}$, k = 1,2, l = 0,1,2,3 are
randomly chosen complex numbers. Observe that when $z_1 = 0$ and $z_2 = 0$ the
solutions to $e(x, z_1, z_2) = 0$ satisfy f(x) = 0. So if we solve $e(x, z_1, z_2) = 0$
we will find a single witness point on the two dimensional solution component
$x_1 = 1$ as a solution with $z_1 = 0$ and $z_2 = 0$. Using polyhedral homotopies,
this requires the tracing of six solution paths.

The embedding was proposed in [SV00] to find generic points on all positive
dimensional solution components with a cascade of homotopies. In [SV00]
it was proven that solutions with slack variables $z_i \neq 0$ are regular and,
moreover, that those solutions can be used as start solutions in a homotopy
to find witness points on lower dimensional solution components. At each stage
of the algorithm, we call solutions with nonzero slack variables nonsolutions.

In the solution of $e(x, z_1, z_2) = 0$, one path ended with $z_1 = 0 = z_2$, the
five other paths ended in regular solutions with $z_1 \neq 0$ and $z_2 \neq 0$. These five
“nonsolutions” are start solutions for the next stage, which uses the homotopy

$$h_2(x, z_1, z_2, t) = \begin{cases}
  (x_1 - 1)(x_2 - x_1^2) + a_{11} z_1 + a_{12} z_2 = 0 \\
  (x_1 - 1)(x_3 - x_1^3) + a_{21} z_1 + a_{22} z_2 = 0 \\
  (x_1^2 - 1)(x_2 - x_1^2) + a_{31} z_1 + a_{32} z_2 = 0 \\
  c_{10} + c_{11} x_1 + c_{12} x_2 + c_{13} x_3 + z_1 = 0 \\
  z_2 (1 - t) + (c_{20} + c_{21} x_1 + c_{22} x_2 + c_{23} x_3 + z_2) t = 0
\end{cases}$$

where t goes from one to zero, replacing the last hyperplane with $z_2 = 0$.
Of the five paths, four of them converge to solutions with $z_1 = 0$. Of those
four solutions, one of them is found to lie on the two dimensional solution
component $x_1 = 1$, the other three are generic points on the twisted cubic. As
there is one solution with $z_1 \neq 0$, we have one candidate left to use as a start
point in the final stage, which searches for isolated solutions of f(x) = 0. The
homotopy for this stage is

$$h_1(x, z_1, t) = \begin{cases}
  (x_1 - 1)(x_2 - x_1^2) + a_{11} z_1 = 0 \\
  (x_1 - 1)(x_3 - x_1^3) + a_{21} z_1 = 0 \\
  (x_1^2 - 1)(x_2 - x_1^2) + a_{31} z_1 = 0 \\
  z_1 (1 - t) + (c_{10} + c_{11} x_1 + c_{12} x_2 + c_{13} x_3 + z_1) t = 0
\end{cases}$$

which, as t goes from 1 to 0, replaces the last hyperplane with $z_1 = 0$. At t = 0, the
solution is found to lie on the twisted cubic, so there are no isolated solutions.

The calculations are summarized in Figure 8.17. The breakup into irre- 
ducibles will be explained in the next section. 


8.3.5 Factorization methods 


A recent trend in computer algebra is the adaptation of symbolic methods
to deal with approximate input data, which leads to the use of hybrid methods
[CKW02]. One such problem is the factorization of multivariate polynomials,
listed as a challenge in [Kal00]. Recent papers on this problem are
[CGvH+01, CGKW02], [GR01, GR02], [HWSZ00], and [Sas01].


[Figure 8.17: flow chart with two stages, WitnessGenerate (path following from
a homotopy with start solutions) and WitnessClassify (filter points, breakup
into irreducibles). Level 2: 6 paths, 0 at infinity, 1 solution (1 to classify,
1 on $x_1 = 1$), 5 nonsolutions. Level 1: 5 paths, 0 at infinity, 4 solutions
(1 on $x_1 = 1$, 3 to classify, 3 on cubic), 1 nonsolution. Level 0: 1 path,
0 at infinity, 1 solution (0 on $x_1 = 1$, 1 on cubic).]


Fig. 8.17. Numerical Irreducible Decomposition of a system whose solutions are
the 2-dimensional plane $x_1 = 1$ and the twisted cubic. At level i, for i = 2, 1, 0, we
filter candidate witness sets $\widehat{W}_i$ into junk sets $J_i$ and witness sets $W_i$. The sets $W_i$
are partitioned into witness sets $W_{ij}$ for the irreducible components.


Monodromy to partition witness point sets 


We can see whether a curve factors or not by looking at its plot in complex 
space, i.e.: we consider the curve as a Riemann surface. Figure 8.18 was made 
with Maple (see [CJ98] for instructions). 

Looking at Figure 8.18, imagine a line which intersects the surface in three 
points. Taking one complete turn of the line around the vertical axis z = 0 
will cause the points to permute. For example, the point which was lowest will 
have moved up, while another point will have come down. Such a permutation 
can only happen if the corresponding algebraic curve is irreducible. 


Fig. 8.18. The Riemann surface of $z - w^3 = 0$. The height of the surface is the
real part of $w = z^{1/3}$, while the gray scale corresponds to the imaginary part of
$w = z^{1/3}$. Observe that a loop around the origin permutes the order of points.


Based on this observation, we can decompose any pure dimensional set
into irreducible components. Our monodromy algorithm returns a partition
of the witness set for a pure dimensional component: points in the same subset
of the partition belong to the same irreducible component. Recall that witness
points are defined by a system f(x) = 0 and a set of hyperplanes L(x) = 0.
With the homotopy

$$h_{KL}(x,t) = \lambda \begin{pmatrix} f(x) \\ K(x) \end{pmatrix} (1 - t)
              + \begin{pmatrix} f(x) \\ L(x) \end{pmatrix} t = 0, \quad \lambda \in \mathbb{C},$$

we find new witness points on the hyperplanes K(x) = 0, starting at those
witness points satisfying L(x) = 0, letting t move from one to zero. Choosing
another random constant $\mu \neq \lambda$, we move back from K to L, using the
homotopy

$$h_{LK}(x,t) = \mu \begin{pmatrix} f(x) \\ L(x) \end{pmatrix} (1 - t)
              + \begin{pmatrix} f(x) \\ K(x) \end{pmatrix} t = 0, \quad \mu \in \mathbb{C}.$$

The homotopies $h_{KL}(x,t) = 0$ and $h_{LK}(x,t) = 0$ implement one loop in the
monodromy algorithm, moving witness points from L to K and then back
from K to L. At the end of the loop we have the same witness set as the set
we started with, except possibly permuted. Permuted points belong to the
same irreducible component.
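
The permutation effect is easy to observe on the curve of Figure 8.18. The sketch below is an illustration, not the monodromy algorithm of the text: instead of moving between two random slices with random constants, it moves the slice z = const once around the branch point at the origin, and it follows the three roots of $w^3 - z = 0$ with Newton's method as corrector. Step count and iteration count are assumptions chosen to keep the paths well separated.

```python
import cmath

def track_loop(w, steps=400, newton_iterations=4):
    """Follow a root w of w**3 - z = 0 while z = exp(i*theta) circles the origin once."""
    for k in range(1, steps + 1):
        z = cmath.exp(2j * cmath.pi * k / steps)
        for _ in range(newton_iterations):      # corrector: Newton's method in w
            w -= (w**3 - z) / (3 * w**2)
    return w

start = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]   # the roots over z = 1
end = [track_loop(w) for w in start]

# Match each end point with the start point it coincides with.
permutation = [min(range(3), key=lambda j: abs(e - start[j])) for e in end]
print(permutation)   # prints: [1, 2, 0]
```

The loop returns the same three points in a cyclically permuted order, so all three lie on one irreducible curve.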

Notice that the monodromy algorithm does not know the locations of
the singularities. See [DvH01] for the algorithms to compute the monodromy 
group of an algebraic curve in Maple (package algcurves). Using homotopies 
theoretically, the complexity of factoring polynomials with rational coefficients 
was shown in [BCGW93] to be in NC. 


Linear traces to validate the partition 


When we run the monodromy algorithm, we may not have made enough loops 
to group as many witness points as the degree of each factor, i.e.: the partition 
predicted by the monodromy might be too fine. For a k-dimensional solution 
component, it suffices to consider a curve on the component cut out by k— 1 
random hyperplanes. The factorization of the curve tells the decomposition 
of the solution component. Therefore, we restrict our explanation of using the 
linear trace to the case of a curve in the plane. 

Suppose we have three points in the plane obtained as (projections of) 
witness points from some polynomial system. If the monodromy found loops 
between those points, then we know that these points lie on an irreducible 
factor of degree at least three. Whence our question: is this irreducible factor 
on which the given three points lie of degree three? 

To answer this question we represent the factor by a cubic polynomial f
in the form

$$f(x, y(x)) = (y - y_1(x))(y - y_2(x))(y - y_3(x))
            = y^3 - t_1(x) y^2 + t_2(x) y - t_3(x).$$

Since deg(f) = 3, $\deg(t_1) = 1$, so $t_1$ is the linear trace: $t_1(x) = c_1 x + c_0$.
We now proceed as follows. Via interpolation we find the coefficients $c_0$
and $c_1$. We first sample the cubic at $x = x_0$ and $x = x_1$. The samples
are $\{(x_0, y_{00}), (x_0, y_{01}), (x_0, y_{02})\}$ and $\{(x_1, y_{10}), (x_1, y_{11}), (x_1, y_{12})\}$. To find
$c_0$ and $c_1$ we then solve the linear system

$$\begin{cases}
  y_{00} + y_{01} + y_{02} = c_1 x_0 + c_0 \\
  y_{10} + y_{11} + y_{12} = c_1 x_1 + c_0
\end{cases}$$

With $t_1$ we can predict the sum of the y’s for a fixed choice of x. For example,
samples at $x = x_2$ are $\{(x_2, y_{20}), (x_2, y_{21}), (x_2, y_{22})\}$, see Figure 8.19.
So our test consists in computing $t_1(x_2)$ in two ways:

$$c_1 x_2 + c_0 = y_{20} + y_{21} + y_{22}.$$


If the equality holds, then the answer to our question is yes. 
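
The test is short enough to write out. The sketch below is an illustration, not the authors' implementation: it uses the reducible curve $(y - x)(y^2 - x - 2) = 0$ as an assumed example, samples the two branches of the quadratic factor at real values of x, and compares a complete set of witness points with an incomplete one. The function names are hypothetical.

```python
import math

# Reducible cubic (y - x) * (y**2 - x - 2) = 0; the quadratic factor has the
# two branches y = +sqrt(x + 2) and y = -sqrt(x + 2) for real x > -2.
def branches(x):
    return [math.sqrt(x + 2), -math.sqrt(x + 2)]

def trace_test(samples, tol=1e-9):
    """samples: three pairs (x, list of y-values); interpolate on the first two."""
    (x0, y0), (x1, y1), (x2, y2) = samples
    c1 = (sum(y1) - sum(y0)) / (x1 - x0)      # slope of the candidate linear trace
    c0 = sum(y0) - c1 * x0
    return abs(c1 * x2 + c0 - sum(y2)) < tol  # compare prediction with third sample

xs = [0.0, 1.0, 2.0]
both = [(x, branches(x)) for x in xs]         # all witness points of the factor
one = [(x, branches(x)[:1]) for x in xs]      # an incomplete subset
print(trace_test(both), trace_test(one))      # prints: True False
```

The sum over both branches is linear in x, so the complete set passes; the sum over a single branch is not, which signals that more witness points must be grouped before the factor is complete.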


Efficiency and numerical stability 


The validation with the linear trace is fast. Therefore, our implementation
does this validation each time a new loop with the monodromy algorithm
is found. Even though we do not know the locations of the singularities, practical
experience on many systems shows that permutations are found rapidly. While
this approach is suitable for irreducible factors of very large degree (e.g., one
thousand), strategies based purely on traces often perform better for smaller
degrees.


Fig. 8.19. The linear trace test on a planar cubic. To find the trace we interpolate
through the samples at $x = x_0$ and $x = x_1$. Samples at $x = x_2$ are used in the test.

Related to the efficiency is good numerical stability: if we can compute 
witness points with standard machine arithmetic, then we can also factor 
using standard machine arithmetic. This feature is very important when the 
accuracy of coefficients of the polynomial system is limited. 


Exercise 8.3.2. Apply phc -f to factor 


X**G - XH + QeK**5*Z - x**gKY RAD - KHKAKYRZEXKKOG KY HRS, 
- AXXO KY HKQEZ + B*X*KKOKYRZEKD — QDeRX**K3*Z*EK3 + SHXHKKQ EY *RO*Z, 
— GHXAKQKY#HAQAZHERD + BS#XHKDQHYAZHRS — X#*KQHZHHA + BeXKYHKZ*Z*D 
- Axx Y*KQEZ ERS + QHKHKYHZRKA FY *RKOKZ HKD - YREQRZ* RG 5 


which is a polynomial in a format accepted by phc. 


Exercise 8.3.3. Consider again the system of adjacent minors from Exer- 
cise 8.3.1. Determine the number of irreducible factors and their degrees. 


See Chapter 9 for more on factorization methods. 


8.4 Software and applications 


8.4.1 Software for polynomial homotopy continuation 


We agree with the statement: “It can be argued that the ‘mission’ of numerical 
analysis is to provide the scientific community with effective software tools.” 
(taken from the preface to [GVL83]). Aside from our missionary intentions, 
software has helped us in refining our algorithms, along the lines of the quote
(from [Knu96]): “Another reason that programming is harder than the writing
of books and research papers is that programming demands a significantly
higher standard of accuracy.”

The software package PHCpack [Ver99a] is currently undergoing the tran- 
sition from being a toolbox/black-box for various homotopy continuation 
methods to approximate all isolated solutions to a complete solving environ- 
ment with capabilities to handle positive dimensional solution components 
efficiently, both in terms of computer operations and user manipulations. By 
the latter we hint at the search to find the right user interface, identifying the 
right data flow and trying to balance the toolbox with the black-box approach. 

While PHCpack offered the first reliable implementation of polyhedral 
homotopies, its efficiency is currently surpassed by the implementations
described in [GL00, GL03, LL01] and [DKK03, GKK+04, KK03b, TKF02]. To
interact better with other codes, we are currently developing an interface from 
the Ada routines in PHCpack to routines written in C. Another (but related) 
interface concerns the interaction with computer algebra software. In [SVW03] 
we describe a very simple interface to Maple. 


8.4.2 Applications 


A benchmark suite for systems with positive dimensional solution components 
is gradually taking shape. Rather than listing summaries of a benchmark, 
we choose to treat two very typical applications: the cyclic n-roots problem 
from computer algebra and a special Stewart-Gough platform from mechanical 
design. 


The cyclic n-roots problem. This problem is already interesting not only 
by its compact formulation and widespread fame in the computer algebra 
community, but also by known theoretical results concerning the number 
of isolated roots when n is prime [Haa96]. 

For n = 8, there are 16 one dimensional irreducible components: eight 
quadrics and eight curves of degree 16. While approximations to all 1,152 
isolated cyclic 8-roots were found already in the first release of PHCpack, 
monodromy was needed to factor the curve of degree 144 into irreducibles. 
To compute all witness points for the cyclic 9-roots problem, the software 
of [LL01] was essential. While the factorization of a two dimensional
component of degree 18 into six cubics posed no difficulty, the homotopy
membership test was required to certify that among the 6,642 isolated ones 162
cyclic 9-roots occurred with multiplicity four. In addition, multi-precision 
arithmetic was used to confirm this result. 

The isolated cyclic n-roots (up to n = 13, for which 2,704,156 paths were
traced) can be found on the Internet
(http://www.is.titech.ac.jp/~kojima/polynomials/cyclic13). These roots have
been computed with PHoM [GKK+04].
A special Stewart-Gough platform. The Stewart-Gough platform is a
parallel robot which attracted lots of interest from computational
kinematicians and researchers in computer algebra. That the platform has
forty isolated solutions was first established computationally by
continuation [Rag93] and elimination methods [Laz92, Mou93], and later proved
analytically [Hus96], [RV95], and [Wam96].
A six-legged platform (similar to the general Stewart-Gough platform)
which permits motion was presented by Griffis and Duffy in [GD93] and
first analyzed in [HK00]. It is called the Griffis-Duffy platform. Instead
of forty isolated solutions we now consider a curve. In our formulation of 
the two cases we studied, twelve lines corresponded to degenerate cases 
deemed uninteresting from a mechanisms point of view. In the first case 
we were then left with one irreducible component of degree 28, while in 
the second case we found five components, four of degree six (one sextic 
was not reported in the analysis of [HK00]), and one component of degree 
four, see Figure 8.20. 


Fig. 8.20. One component of the Griffis-Duffy platform. Starting at the configura- 
tion at the left, we see the clockwise rotation of the end platform. 


It is interesting to note that the running times for the factorization with 
the monodromy-traces method do not seem to depend on the particular 
geometry of the system, i.e.: the execution times are about the same in 
both cases, when we deal with one irreducible factor of high degree or 
with several factors of smaller degrees. 
Summary. Polynomial factorization is one of the main chapters of Computer
Algebra. Recently, significant progress was made on absolute factorization (i.e., over
the complex field) of a multivariate polynomial with rational coefficients, with two 
families of algorithms proposing two different strategies of computation. One is rep- 
resented by Gao’s algorithm and is explained in Lecture 2. The other is represented 
by the Galligo-Rupprecht-Chéze algorithm, presented in Lectures 4 and 5. The latter 
relies on an original use of the monodromy map attached to a generic projection of a 
plane curve on a line. It also involves zero-sums relations (introduced by Sasaki and 
his collaborators) with efficient semi-numerical computations to produce a certified 
exact result. 


9.0 Introduction, definitions and examples 


9.0.1 Rational and absolute factorization 


A system of polynomial equations $I = (g_1, \dots, g_n)$ corresponds to an algebraic
variety $V = V(g_1, \dots, g_n)$. When the dimension of $V$ is zero a natural question
is: What is the cardinality of V, and what are the coordinates of the points 
of V? When the dimension is not zero this natural question becomes: What 
is the number of irreducible components of V, and what are the equations of 
these components? 

In the special case of one polynomial $f(X,Y) \in \mathbb{Q}[X,Y]$, the answer to
these questions is given by the absolute factorization of $f(X,Y)$. The
absolute factorization of $f$ is the factorization $f = f_1 \cdots f_s$, where the
$f_i$ are irreducible in $\mathbb{C}[X,Y]$. This provides the decomposition into
irreducible components $V(f) = V(f_1) \cup \dots \cup V(f_s)$. Now, let us tell a
short story about absolute polynomial factorization.

Polynomial factorization is one of the main chapters of Computer Algebra. 
The implementation of basic algorithms, derived from classical and elemen- 
tary commutative algebra, appeared in the first Computer Algebra systems 
in the 60’s. During the last 30 years, at every international conference in 
Computer Algebra, there have been new contributions on factorization
algorithms and their complexity. Several papers and books, including [vzG85] and
[Kal90], relate the early history of these topics and provide a comprehensive 
bibliography till the end of the 80’s. 

The early authors and implementors, like Berlekamp, Musser, Wang, 
Zassenhaus and others, introduced many ideas that are now classical such 
as reduction and Hensel liftings, genericity and randomization. The LLL lat- 
tice basis reduction algorithm of Lenstra, Lenstra and Lovasz (1982) allowed 
for the first time a polynomial-time algorithm to be established for factor- 
ing univariate polynomials over the rational numbers. In the early 80’s, this 
was followed by many complexity results on univariate or multivariate fac- 
torization algorithms (over different fields), which are either polynomial-time 
or probabilistic polynomial-time. One of the first such results for multivariate 
absolute irreducibility testing, e.g. over the complexes, for a polynomial with 
rational coefficients, is due to Heintz-Sieveking [HS81]. It popularized in the 
community of Computer Algebra and Complexity the use of Bertini’s theorem 
(see Lecture 1) and was followed by many authors, including von zur Gathen 
and Kaltofen. 

The early works of Berlekamp (1967 and 1970), or Cantor and Zassenhaus 
(1981) for univariate polynomials over finite fields could run in quadratic or 
even subquadratic time. In practice, rational factorization of most polynomials 
can be computed efficiently using Hensel lifting; see e.g. Musser (1975) and 
Wang (1978). Lauder and Gao [Gao03] proved that the average running time 
of a Hensel lifting based algorithm for factoring bivariate polynomials over 
finite fields is almost linear. There are, however, infinitely many polynomials 
that need exponential time via Hensel lifting (see [Kal85b]). Although this can 
be improved, we can say that there are good algorithms and rather satisfactory 
implementations to perform rational polynomial bivariate (and multivariate) 
factorization. 

Absolute factors of a polynomial with rational coefficients have coefficients 
which are algebraic numbers. These can be represented either by elements in 
a precisely described extension Q(a) of Q or in C by imprecise floating point 
numbers which approximate them. This distinction gives rise to two families 
of algorithms: one kind which ultimately relies on linear algebra and can be 
developed on Q, e.g. the algorithms by Trager-Traverso, Kaltofen, Duval, Gao, 
Cormier-Singer-Trager-Ulmer (see Lecture 3), and another kind which uses 
topological properties of $\mathbb{C}^2$, Newton approximation or so-called homotopy
methods and for which floating point approximations are better suited, e.g. 
the algorithms of Sasaki, Galligo-Rupprecht, Sommese-Verschelde-Wampler 
(see Lecture 4). Once such an approximate absolute factorization algorithm is 
available, it is still necessary to compute the exact factors. This has been done 
by Chéze-Galligo and will be discussed in Lecture 5. One can say today that 
the best algorithms were all discovered within the past ten years and there 
is still progress to be made. Another important preliminary topic is absolute 
irreducibility testing (see Lecture 2). 
Now we introduce basic definitions and statements. 


9.0.2 Facts and definitions 


Definition 9.0.1. Let $A$ be a domain. We say that $A$ is a unique factorization
domain if for all $a \in A \setminus \{0\}$ we can write $a = u\,p_1 \cdots p_s$ where $u$ is a
unit and $p_1, \dots, p_s$ are irreducible in $A$ and this decomposition is unique up to
reordering and multiplication by units.


Example 9.0.2. Z and all fields are unique factorization domains. 


Theorem 9.0.3. If $A$ is a unique factorization domain then $A[X]$ is a unique
factorization domain.


Corollary 9.0.4. Let $k$ be a field, then $k[X_1, \dots, X_n]$ is a unique factorization
domain.

This means: for all $P \in k[X_1, \dots, X_n]$, $P = P_1 \cdots P_s$ (factorization), with
$P_i$ irreducible in $k[X_1, \dots, X_n]$ and this decomposition is unique up to
reordering and multiplication by constant factors.


Remark 9.0.5. Let $k \subset K$ be an inclusion of fields and $P \in k[X_1, \dots, X_n]$.
$P$ can be irreducible in $k[X_1, \dots, X_n]$ but reducible in $K[X_1, \dots, X_n]$. For
example: $k = \mathbb{Q}$, $K = \mathbb{C}$ and $X^2 + Y^2 = (X + iY)(X - iY)$.


Definition 9.0.6. Let $K = \overline{k}$ be the algebraic closure of the field $k$, and
$P \in k[X_1, \dots, X_n]$. The factorization of $P$ in $K[X_1, \dots, X_n]$ is called the
absolute factorization of $P$.


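Exercise 9.0.7. Let $P(X,Y) \in \mathbb{Q}[X,Y]$, and $P(X,Y) = \prod_{i=1}^{s} P_i(X,Y)$ its
factorization in $\mathbb{C}[X,Y]$. Show that this factorization is the absolute
factorization (i.e. $P_i(X,Y) \in \overline{\mathbb{Q}}[X,Y]$).

a) Set $P_1(X,Y) = a_n(X)Y^n + \dots + a_0(X)$. Show that for all $x \in \overline{\mathbb{Q}}$, $a_i(x)$
belongs to $\overline{\mathbb{Q}}$.

b) Let $p(T) = \sum_i p_i T^i \in \mathbb{C}[T]$ such that for all $x \in \overline{\mathbb{Q}}$, $p(x)$ belongs to $\overline{\mathbb{Q}}$.
Prove that $p_i \in \overline{\mathbb{Q}}$. (Hints: Write a Vandermonde system, and use Cramer’s
rule.)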


There exist simple algorithms which compute absolute factorizations but
are not efficient for degree > 15. For example, in Maple the command

    evala(AFactor(.))

implements an algorithm which we will explain below. For instance,

    evala(AFactor(X^2-2*Y^2));

gives

    (X-RootOf(_Z^2-2)*Y)*(X+RootOf(_Z^2-2)*Y).

That means $X^2 - 2Y^2 = (X - \sqrt{2}\,Y)(X + \sqrt{2}\,Y)$. We remark that the two
factors have the same monomials and their coefficients are conjugate over $\mathbb{Q}$.
The next lemma generalizes this remark.


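Lemma 9.0.8 (Fundamental Lemma). Let $P \in \mathbb{Q}[X,Y]$ be a monic and
irreducible polynomial in $\mathbb{Q}[X,Y]$,
$P(X,Y) = Y^n + a_{n-1}(X)Y^{n-1} + \dots + a_0(X)$ with $\deg(a_i(X)) \le n - i$.
Let $P = P_1 \cdots P_s$ be a factorization of $P$ into irreducible polynomials $P_i$ in
$\mathbb{C}[X,Y]$. Denote by $K = \mathbb{Q}[\alpha]$ the extension of $\mathbb{Q}$ generated by all the
coefficients of $P_1$. Then each $P_i$ can be written:

\[
P_i(X,Y) = Y^m + b_{m-1}(\alpha_i, X)\,Y^{m-1} + \dots + b_0(\alpha_i, X),
\]

with $b_k \in \mathbb{Q}[Z,X]$, $\deg_X(b_k) \le m - k$, and where $\alpha_1, \dots, \alpha_s$ are the different
conjugates over $\mathbb{Q}$ of $\alpha = \alpha_1$.


Proof. We can suppose that each $P_i$ is monic in $Y$, because $P$ is monic in $Y$.
We set $P_i(X,Y) = Y^{n_i} + a^{(i)}_{n_i-1}(X)\,Y^{n_i-1} + \dots + a^{(i)}_0(X)$ with
$a^{(i)}_k \in \overline{\mathbb{Q}}[X]$ and $\deg_X(a^{(i)}_k(X)) \le n_i - k$. Let $K$ be the field generated by
all the coefficients of $P_1$; by the primitive element theorem we can set
$K = \mathbb{Q}[\alpha]$. $\alpha$ is an algebraic number over $\mathbb{Q}$ and we denote by
$\alpha_1 = \alpha, \alpha_2, \dots, \alpha_k$ its $k$ different conjugates over $\mathbb{Q}$, and by
$\sigma_1, \dots, \sigma_k$ the $\mathbb{Q}$-homomorphisms from $\mathbb{Q}[\alpha]$ into $\mathbb{C}$ such that
$\sigma_i(\alpha) = \alpha_i$.

Now we prove that $k \le s$. Let $M$ be the extension of $\mathbb{Q}$ generated by the
coefficients of $P_1, \dots, P_s$; $M$ is a finite extension of $\mathbb{Q}$, and we have

\[
\mathbb{C} \supset M \supset K \supset \mathbb{Q}.
\]

We can extend to $M$ all the $\sigma_i$. Then we extend $\sigma_i$ to $M[X,Y]$, and we denote
this map by $\sigma_i$. We have $\sigma_i(P) = \sigma_i(P_1) \cdots \sigma_i(P_s) = P$. Since $\mathbb{C}[X,Y]$ is a
unique factorization domain, there exists an index $j_0$ such that $\sigma_i(P_1) = P_{j_0}$.
Furthermore, if $\sigma_i(P_1) = \sigma_j(P_1)$ then $\sigma_i = \sigma_j$. So the map

\[
\{\sigma_1, \dots, \sigma_k\} \to \{P_1, \dots, P_s\}, \qquad \sigma_i \mapsto \sigma_i(P_1)
\]

is injective and $k \le s$.

If $k < s$ we get an absurd result. Indeed, consider $F = \prod_{i=1}^{k} \sigma_i(P_1)$; this
polynomial divides $P$, so if we prove that $F \in \mathbb{Q}[X,Y]$, we are done: $F$ would
be a proper nonconstant divisor of $P$ in $\mathbb{Q}[X,Y]$, contradicting the
irreducibility of $P$.

Write $P_1(X,Y) = \sum_{a,b} c_{a,b}(\alpha)\,X^a Y^b$ where $c_{a,b}(T) \in \mathbb{Q}[T]$. Thus

\[
F(X,Y) = \prod_{i=1}^{k} \Big( \sum_{a,b} c_{a,b}(\alpha_i)\,X^a Y^b \Big).
\]

The coefficient of $X^a Y^b$ is written

\[
\sum_{\substack{i_1 + \dots + i_k = a \\ j_1 + \dots + j_k = b}}
c_{i_1,j_1}(\alpha_1) \cdots c_{i_k,j_k}(\alpha_k).
\]

It is a symmetric polynomial in $\alpha_1, \dots, \alpha_k$, so it is rational; we deduce that
$F(X,Y) \in \mathbb{Q}[X,Y]$.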


Remark 9.0.9. For each $P$ it suffices to get $P_1$ to describe the absolute
factorization of $P$.
A first method 


Here we describe the absolute factorization algorithm implemented in Maple. 
It consists of the following 4 steps. 


Algorithm 9.0.10 TRAGER-TRAVERSO ALGORITHM.
Input: $f(X_1, \dots, X_n) \in \mathbb{Z}[X_1, \dots, X_n]$.

1. Compute a factorization in $\mathbb{Z}[X_1, \dots, X_n]$ and reduce to 2 variables (see
   below).

2. For each irreducible factor $P \in \mathbb{Z}[X,Y]$: Fix an integer value $a$ of $X$ such
   that $\operatorname{disc}_Y P(a) \neq 0$. Factorize $P(a,Y)$ over $\mathbb{Z}[Y]$, choose an irreducible
   factor $q$ and make an alias: $\beta$ = RootOf($q$).

3. Compute a factorization of $P$ in $\mathbb{Q}(\beta)[X,Y]$, i.e. apply factor($P$, $\beta$). This
   does not provide a complete absolute factorization, but splits the
   polynomials into (at least) two factors in $K[X,Y]$ with $K = \mathbb{Q}[t]/(q(t))$.

4. Lift the factorization.

Output: An absolute factor of $f$.


The first step uses Hilbert’s or Bertini’s theorem and the last step uses 
Hensel’s theorem. In Lecture 1 we will study these theorems. Step 3 is
validated by the following theorem (cf. [Kal85a] or [DT89]).


Definition 9.0.11. Let $\overline{k}$ be the algebraic closure of the field $k$, and let
$(\alpha, \beta) \in \overline{k}^2$. We say that $(\alpha, \beta)$ is a simple solution of $P(X,Y) \in k[X,Y]$
when $P(\alpha, \beta) = 0$ and either $\frac{\partial P}{\partial X}(\alpha, \beta)$ or $\frac{\partial P}{\partial Y}(\alpha, \beta)$ is nonzero.

Remark 9.0.12. It is easy to see that if $(\alpha, \beta)$ is a simple solution of $P(X,Y)$
then $(\alpha, \beta)$ is a simple solution of just one absolute factor of $P$.

In step 2 of the algorithm we get a simple point $(\alpha, \beta)$ of $P(X,Y)$.


Theorem 9.0.13. Let $(\alpha, \beta)$ be a simple solution of $P(X,Y)$. Then one
absolute factor of $P(X,Y)$ belongs to $k[\alpha, \beta][X,Y]$.


Proof. Let $P = F_1 F_2 \cdots F_t$ be the factorization of $P$ in $k[\alpha, \beta][X,Y]$, where
$F_1$ is such that $F_1(\alpha, \beta) = 0$. $F_1$ is the only factor with this property. Suppose
that $F_1$ is reducible in $\overline{k}[X,Y]$. We are going to show that this is absurd.
Write $F_1 = \prod_j P_{i_j}$ where the $P_{i_j}$ are absolute factors of $P$ and suppose
that $P_{i_1}(\alpha, \beta) = 0$. With the same kind of arguments as those used in the
proof of the fundamental lemma, we can show that there exist a
$k(\alpha, \beta)$-homomorphism $\sigma$ and an index $i_j \neq i_1$ such that $\sigma(P_{i_1}) = P_{i_j}$. As
$P_{i_1}(\alpha, \beta) = 0$, we have $P_{i_j}(\alpha, \beta) = \sigma(P_{i_1})(\alpha, \beta) = 0$; we deduce that
$(\alpha, \beta)$ is not a simple solution and this contradicts the hypothesis.
Example 9.0.14 (see [Rag97]). Let $P(X,Y) = Y^{10} - 2X^2Y^4 + 4X^6Y^2 - 2X^{10}$.
It is an irreducible polynomial in $\mathbb{Q}[X,Y]$ since $P(1,Y) = Y^{10} - 2Y^4 + 4Y^2 - 2$
is irreducible in $\mathbb{Q}[Y]$. Let $K = \mathbb{Q}[T]/(P(1,T))$. The factorization of $P$ over
$K[X,Y]$ is the following:

    >alias(beta=RootOf(x^10-2*x^4+4*x^2-2));
    >factor(P, beta);

    (Y^5 - 2*Y^2*X*beta + 2*Y^2*X*beta^3 - Y^2*X*beta^5 - Y^2*X*beta^7
       - Y^2*X*beta^9 + 2*X^5*beta - 2*X^5*beta^3 + X^5*beta^5
       + X^5*beta^7 + X^5*beta^9)
    (Y^5 + 2*Y^2*X*beta - 2*Y^2*X*beta^3 + Y^2*X*beta^5 + Y^2*X*beta^7
       + Y^2*X*beta^9 - 2*X^5*beta + 2*X^5*beta^3 - X^5*beta^5
       - X^5*beta^7 - X^5*beta^9)

The time needed by Maple for this computation is 691.531 seconds (on a
small PC). Here
$P_1(X,Y) = Y^5 + (\beta^9 + \beta^7 + \beta^5 - 2\beta^3 + 2\beta)\,Y^2X + (-\beta^9 - \beta^7 - \beta^5 + 2\beta^3 - 2\beta)\,X^5$
is an absolute factor of $P$ and it satisfies $P_1(\alpha, \beta) = 0$ (here $\alpha = 1$).

The method is simple but the drawback is that $K$ is too big. A smaller
extension will work better and faster, in most cases. In our example, an absolute
factor is $G(X,Y) = Y^5 - \sqrt{2}\,XY^2 + \sqrt{2}\,X^5$. Computing factor(P,sqrt(2))
takes only 0.27 seconds.
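
The factorization over the small extension $\mathbb{Q}(\sqrt{2})$ can be checked by
hand or with a few lines of code. The following sketch is not the Maple
computation above; it is a minimal exact verification in Python, where the
representation of $a + b\sqrt{2}$ as an integer pair `(a, b)` and the helper
names `mul_qsqrt2` and `poly_mul` are our own illustrative choices.

```python
from collections import defaultdict

def mul_qsqrt2(u, v):
    """Multiply a + b*sqrt(2) and c + d*sqrt(2), stored as pairs (a, b), (c, d)."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def poly_mul(p, q):
    """Multiply bivariate polynomials stored as {(deg_X, deg_Y): (a, b)}."""
    out = defaultdict(lambda: (0, 0))
    for (i, j), u in p.items():
        for (k, l), v in q.items():
            w = mul_qsqrt2(u, v)
            s = out[(i + k, j + l)]
            out[(i + k, j + l)] = (s[0] + w[0], s[1] + w[1])
    # drop the monomials whose coefficient cancelled
    return {m: c for m, c in out.items() if c != (0, 0)}

# G = Y^5 - sqrt(2) X Y^2 + sqrt(2) X^5 and its conjugate under sqrt(2) -> -sqrt(2)
G = {(0, 5): (1, 0), (1, 2): (0, -1), (5, 0): (0, 1)}
G_conj = {(0, 5): (1, 0), (1, 2): (0, 1), (5, 0): (0, -1)}

# P = Y^10 - 2 X^2 Y^4 + 4 X^6 Y^2 - 2 X^10
P = {(0, 10): (1, 0), (2, 4): (-2, 0), (6, 2): (4, 0), (10, 0): (-2, 0)}

product = poly_mul(G, G_conj)
```

The product of $G$ and its conjugate has only rational coefficients and
coincides with $P$, in agreement with the Fundamental Lemma: the two absolute
factors have the same monomials and conjugate coefficients.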

Note that this first method relies on a rational factorization algorithm. 


9.0.3 Rational factorization 


We can quickly summarize the polynomial factorization process over a finite 
extension of Q with the following diagram. 


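$P \in \mathbb{Q}[\alpha][X,Y]$ with $q(\alpha) = 0$ and $q \in \mathbb{Z}[T]$ is irreducible.

    ↓ $y_0$ generic                  ↑ Hensel lifting in $(y - y_0)$

$f(X) = P(X, y_0) \in \mathbb{Q}[\alpha][X]$, $f \in \frac{1}{D}\mathbb{Z}[\alpha][X]$

    ↓ $p$ a generic prime number     ↑ Hensel lifting in $p$

$D f \in \frac{\mathbb{Z}}{p\mathbb{Z}}(\overline{\alpha})[X]$ with $\overline{q}(\overline{\alpha}) = 0$ and $\overline{q} \in \frac{\mathbb{Z}}{p\mathbb{Z}}[T]$ is irreducible.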


We see that Hensel’s lifting is a very useful tool; we recall Hensel’s theorem 
in Lecture 1. 


9.1 Lecture 1: Theorems of Hilbert and Bertini, 
reduction to the bivariate case, irreducibility tests 


9.1.1 Hilbert’s irreducibility theorem 


We present a simple version of Hilbert’s theorem (see [Lan83] and [Zip93]):

Theorem 9.1.1 (Hilbert’s irreducibility theorem). Let $K$ be a finite
algebraic extension of $\mathbb{Q}$, and let $f(T_1, \dots, T_r, X_1, \dots, X_s)$ be an irreducible
polynomial in $\mathbb{Q}[T_1, \dots, T_r, X_1, \dots, X_s]$. Then almost all points
$(t_1, \dots, t_r) \in K^r$ are such that $f(t_1, \dots, t_r, X_1, \dots, X_s)$ is irreducible in
$K[X_1, \dots, X_s]$.


Remark 9.1.2. This theorem is false if we replace $K$ by a finite field $\mathbb{F}$. Consider
$f(X,Y) = X^2 - Y$, then for all points $t$ of $\mathbb{F} = \mathbb{Z}/2\mathbb{Z}$, $f(X,t)$ is reducible. Below
we will study Bertini’s theorem which works also for finite fields. Furthermore, 
Bertini’s theorem gives rise to a probabilistic statement. 


9.1.2 Hensel’s lemma 


The idea of Hensel’s lemma is to mimic Newton’s method in an algebraic set- 
ting. Newton’s iteration method gives an exact solution from an approximate 
one; here the approximation is $I$-adic, where $I$ is an ideal in a ring $A$ (see
[Eis95], [Gou97], [Zip93] for the definition of the complete ring $\hat{A}_I$ and its
properties).

Hereafter all rings are commutative, unique factorization domains, and 
Noetherian. Indeed in our setting A is one of the following rings: 

$\mathbb{Z}$, $\mathbb{Z}[X_1, \dots, X_n]$, $\mathbb{Q}[X_1, \dots, X_n]$, $\mathbb{C}[X_1, \dots, X_n]$, where $n \ge 1$.


Theorem 9.1.3. Let $I$ be an ideal of a ring $A$. Let

\[
F = \big(f_1(X_1, \dots, X_n), \dots, f_n(X_1, \dots, X_n)\big)
\]

be polynomials over $A$, and denote their Jacobian with respect to the $X_i$ by
Jac. Let $(x_1, \dots, x_n)$ be a zero of $F$ modulo $I$, such that the determinant of
$\mathrm{Jac}(x_1, \dots, x_n)$ has an inverse in $A/I$.
Then there exist unique elements $(\hat{x}_1, \dots, \hat{x}_n)$ of $\hat{A}_I$, $\hat{x}_i = x_i \bmod I$, for
which $F(\hat{x}_1, \dots, \hat{x}_n) = 0$.

This implies (see [Zip93]):

Theorem 9.1.4 (Hensel’s lemma). Let $f(X)$ be a monic polynomial over
$A$, and $I$ be an ideal of $A$. Assume there exist monic polynomials $g_1(X)$, $h_1(X)$
in $\frac{A}{I}[X]$ which are relatively prime and such that $f(X) = g_1(X)h_1(X) \bmod I$.
Then for every $n \in \mathbb{N} - \{0\}$ there exist monic polynomials $g_n(X)$, $h_n(X)$
over $\frac{A}{I^n}[X]$ such that $g_n(X) = g_1(X) \bmod I$, $h_n(X) = h_1(X) \bmod I$ and
$f(X) = g_n(X)h_n(X) \bmod I^n$.

Furthermore, there exist unique polynomials $\hat{g}(X)$ and $\hat{h}(X)$ over $\hat{A}_I$ such
that $\hat{g}(X) = g_1(X) \bmod I$, $\hat{h}(X) = h_1(X) \bmod I$, and for all $n \in \mathbb{N}$,
$f(X) = \hat{g}(X)\hat{h}(X) \bmod I^n$.
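
To make the Newton-like flavour of these statements concrete, here is a
minimal sketch of the simplest instance of Theorem 9.1.3: one polynomial in one
variable, with $A = \mathbb{Z}$ and $I = (p)$. The function name
`hensel_lift_root` and the choice of a linear (one power of $p$ per step)
lifting are our own assumptions, made for readability only.

```python
def hensel_lift_root(coeffs, x, p, n):
    """Lift a simple root x of f modulo p to a root modulo p**n.

    coeffs lists the integer coefficients of f, lowest degree first.
    """
    def f(t):
        return sum(c * t**i for i, c in enumerate(coeffs))

    def df(t):
        return sum(i * c * t**(i - 1) for i, c in enumerate(coeffs) if i)

    # hypotheses of the theorem: f(x) = 0 mod p and f'(x) invertible mod p
    if f(x) % p != 0 or df(x) % p == 0:
        raise ValueError("x must be a simple root of f modulo p")

    inv = pow(df(x), -1, p)        # inverse of f'(x) in Z/pZ (Python >= 3.8)
    modulus = p
    for _ in range(n - 1):
        modulus *= p
        # Newton-like correction; gains one power of p per step
        x = (x - f(x) * inv) % modulus
    return x

# x = 3 is a root of X^2 - 2 modulo 7; lift it to a root modulo 7^3
root = hensel_lift_root([-2, 0, 1], 3, 7, 3)
```

For instance, starting from $3^2 = 2 \bmod 7$ the iteration produces 10 modulo
49 and then 108 modulo 343, each reducing to the previous approximation.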
Lifting a factorization


Now we show how to get a factorization in $\mathbb{C}[X_1, \dots, X_n]$ from a
factorization in $\mathbb{C}[X_1, X_2]$. Let $f(X_1, \dots, X_n)$, $f_1(X_1, \dots, X_n)$ and
$f_2(X_1, \dots, X_n)$ be polynomials of $\mathbb{C}[X_1, \dots, X_n]$ such that

\[
f(X_1, \dots, X_n) = f_1(X_1, \dots, X_n)\, f_2(X_1, \dots, X_n)
\]

is the absolute factorization of $f$. We know $f$ and we want to find the $f_i$. We
set $d_i = \deg_{X_i}(f)$. Let $I = (X_3 - x_3, \dots, X_n - x_n)$ be an ideal of
$\mathbb{C}[X_1, \dots, X_n]$ where $x_i \in \mathbb{C}$ for $i = 1, \dots, n$. We set

\[
f^{(2)} = f(X_1, X_2, x_3, \dots, x_n)
\]

the image of $f$ in $\mathbb{C}[X_1, \dots, X_n]/I$, similarly

\[
f^{(3)} = f(X_1, X_2, X_3, x_4, \dots, x_n)
\]

and so on. We will get recursively a factorization for all the $f^{(i)}$, and
finally for $f^{(n)} = f(X_1, \dots, X_n)$. So we start with an absolute factorization
for $f^{(2)}$, namely $f^{(2)} = g_1 h_1$. Applying Hensel’s lemma (with
$A = \mathbb{C}[X_2, X_3]$ and $I = (X_3 - x_3)$), we can lift this factorization to
$\mathbb{C}[X_2, X_3][X_1]$ and get:

\[
f^{(3)} = g_{d_3+1}\, h_{d_3+1} \bmod (X_3 - x_3)^{d_3+1}.
\]

The degree condition with $d_3$ and the unicity property in Hensel’s lemma
imply the following: if

\[
\begin{aligned}
f_1(X_1, X_2, x_3, \dots, x_n) &= g_1(X_1, X_2), \\
f_2(X_1, X_2, x_3, \dots, x_n) &= h_1(X_1, X_2),
\end{aligned}
\tag{1}
\]

then

\[
\begin{aligned}
g_{d_3+1}(X_1, X_2, X_3) &= f_1(X_1, X_2, X_3, x_4, \dots, x_n), \\
h_{d_3+1}(X_1, X_2, X_3) &= f_2(X_1, X_2, X_3, x_4, \dots, x_n).
\end{aligned}
\tag{2}
\]

So we obtain a factorization of $f^{(3)}$. Now we restart with
$A = \mathbb{C}[X_2, X_3, X_4]$, $I = (X_4 - x_4)$ and

\[
\begin{aligned}
f^{(4)}(X_1, \dots, X_4) &= f(X_1, X_2, X_3, X_4, x_5, \dots, x_n) \\
&= g_{d_3+1}(X_1, X_2, X_3)\, h_{d_3+1}(X_1, X_2, X_3) \bmod (X_4 - x_4).
\end{aligned}
\]

Then, after $n - 2$ liftings, we obtain the factorization of $f$.


Remark 9.1.5. We supposed that condition (1) is true, that is to say:
$f_1(X_1, X_2, x_3, \dots, x_n)$ and $f_2(X_1, X_2, x_3, \dots, x_n)$ are the absolute factors of
$f(X_1, X_2, x_3, \dots, x_n)$. In other words we supposed that the
$f_i(X_1, X_2, x_3, \dots, x_n)$ are irreducible, but this is not always the case as we
might have the following kind of phenomenon (see [Kal85b]):

\[
\begin{aligned}
f(X,Y) ={}& X^4 + (12Y^3 - 18Y^2 - 18Y + 12)X^3 + (30Y^3 - 72Y^2 + 42Y - 36)X^2 \\
&+ (-432Y^3 + 648Y^2 + 648Y - 432)X - 432Y^3 + 2592Y^2 - 2160Y
\end{aligned}
\]

is absolutely irreducible but $f(X,0) = (X - 6)X(X + 6)(X + 12)$ is reducible.
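
The reducibility of the specialization at $Y = 0$ is easy to confirm. The short
sketch below only checks that claim (it does not test absolute
irreducibility of $f$ itself); the function names are hypothetical.

```python
def f(x, y):
    """The polynomial of Remark 9.1.5, evaluated at integers x and y."""
    return (x**4
            + (12*y**3 - 18*y**2 - 18*y + 12) * x**3
            + (30*y**3 - 72*y**2 + 42*y - 36) * x**2
            + (-432*y**3 + 648*y**2 + 648*y - 432) * x
            - 432*y**3 + 2592*y**2 - 2160*y)

def product_at_zero(x):
    """The claimed factorization of f(X, 0)."""
    return (x - 6) * x * (x + 6) * (x + 12)

# two polynomials of degree 4 that agree at more than 4 points are equal
agree = all(f(x, 0) == product_at_zero(x) for x in range(-10, 11))
integer_roots = [x for x in range(-20, 21) if f(x, 0) == 0]
```

So the choice $x_2 = 0$ is a bad specialization for this polynomial: the
univariate image splits completely although $f$ does not factor.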

In order to avoid this situation which gives rise to an algorithm with 
an exponential complexity, we want to obtain a reduction from multivariate 
polynomials to bivariate polynomials preserving irreducibility. Using Hilbert’s 
theorem we do not know the probability for a polynomial to remain irreducible 
after a substitution. Bertini’s theorem, which we now study, provides this 
knowledge because it implies a useful probabilistic statement. 


9.1.3 Bertini’s theorem 


In the beginning of the 20th century, Bertini proved two important theorems
which bear his name; S. Kleiman gave in [Kle98] a comprehensive report on 
the history, evolution and impact of these theorems. Although it was phrased 
in the terminology of his time, Bertini’s second theorem says the following: 
The intersection of an irreducible algebraic set in $\mathbb{C}^n$ of dimension $r \ge 2$, with
a “generic” $(n - r + 1)$-plane is an irreducible curve. Of course the meaning of
the adjective generic has to be specified. Together with considerations of more 
general varieties this gives rise to several versions of the theorem. A complete 
treatment is provided by Jouanolou’s book [Jou83]. 

In the community of Computer Algebra and Complexity theory, Bertini’s 
(second) theorem was first used and popularized by Heintz-Sieveking [HS81] 
in 1981, soon followed by Kaltofen, von zur Gathen and many others. In 
[HS81], the following “General hyperplane section lemma” is stated, and a 
short algebraic proof is given in an appendix. 


Lemma 9.1.6. Let $k$ be an algebraically closed field and $X_1, \dots, X_n$ be
indeterminates over $k$. Let $P$ be a prime ideal of $k[X_1, \dots, X_n]$ which defines
an affine subvariety of $k^n$ of dimension $r \ge 2$. Let $A_{i,j}$, $A_i$, $i = 1$ to $r - 1$,
$j = 1$ to $n$, be transcendental quantities over $k$, and let $K$ be an algebraically
closed field containing $k$, $A_{i,j}$ and $A_i$, $i = 1$ to $r - 1$, $j = 1$ to $n$. Then, the
ideal

\[
P + \Big( X_1 - \sum_{j > 1} A_{1,j} X_j - A_1,\ \dots,\ X_{r-1} - \sum_{j > r-1} A_{r-1,j} X_j - A_{r-1} \Big)
\]

is a prime ideal in $K[X_1, \dots, X_n]$.


The proof relies on college algebra (i.e. ring and field extensions). The
work [HS81] applies this lemma to the case $k = \mathbb{C}$, $P = (f)$,
$f \in \mathbb{Q}[X_1, \dots, X_n]$, $r = n - 1$. By successive substitutions of $X_i$ by
$\sum_{j > i} A_{i,j} X_j + A_i$, $i = 1$ to $n - 2$, in $f(X_1, \dots, X_n)$, one obtains a bivariate
polynomial $f_0(X_{n-1}, X_n)$, whose coefficients are polynomials in $A_{i,j}$ and $A_i$.


Corollary 9.1.7. $f$ is absolutely irreducible $\iff$ $f_0$ is absolutely irreducible.


Then it is proven in [HS81] that this claim still holds if we replace the
indeterminates $A_{i,j}$ and $A_i$ by random values. They also provide bounds which
make it possible to control this randomness.

In 20 years, these bounds and the presentation of the result have been
improved by several authors including Kaltofen, von zur Gathen, and Gao.
The sharpest and best result (as far as we know) is due to Gao and is a
consequence of his method that we will present in the next section.


Theorem 9.1.8 (Gao’s 2000 improved version of Bertini’s theorem).
Let $K$ be a field and $S$ a finite subset of $K$. Let $f \in K[X_1, \dots, X_n]$ of total
degree $d$ and $f_0(x,y) = f(a_1 x + b_1 y + c_1, \dots, a_n x + b_n y + c_n)$. Suppose $K$ is
either of characteristic zero (e.g. $K = \mathbb{Q}$) or of characteristic larger than $2d^2$.
Then, for random choices of $a_i, b_i, c_i$ in $S$, with probability at least
$1 - \frac{2d^3}{|S|}$ all the absolute irreducible factors of $f$ remain absolutely
irreducible factors of $f_0$ in $K[x,y]$.


We give a sketch of the proof in exercise 9.2.14 below.

This theorem and an absolute bivariate factorization algorithm give a
randomized algorithm for factoring absolutely multivariate polynomials in
$\mathbb{Q}[X_1, \dots, X_n]$. The strategy is as follows. If $d$ is the total degree of the input
polynomial $f(X_1, \dots, X_n)$, one chooses random values $a_1, b_1, c_1, \dots, a_n, b_n,
c_n$ from a set $S$ in $\mathbb{Q}$ with $|S| \ge 4d^3$ and factors the bivariate polynomial
$f_0(x,y) = f(a_1 x + b_1 y + c_1, \dots, a_n x + b_n y + c_n)$ over $\overline{\mathbb{Q}}$. With probability
at least 1/2, the factors of $f_0$ correspond to the factors of $f$ evaluated at
$X_i = a_i x + b_i y + c_i$, $i = 1$ to $n$.

A finer general hyperplane section lemma than the one stated in [HS81], 
is given by part 4 of Theorem 6.3, page 67, in the book of Jouanolou [Jou83]. 


Theorem 9.1.9 (Jouanolou’s 1983 version of Bertini’s theorem). Let
$k$ be an infinite field, $X$ a scheme of finite type and $f : X \to k^n$ a
$k$-morphism. Suppose that $\dim \overline{f(X)} \ge 2$ and $X$ is geometrically irreducible.
Then, for almost all affine hyperplanes $H$ in $k^n$, $f^{-1}(H)$ is geometrically
irreducible.


In our setting, $k = \mathbb{Q}$, $X$ is an irreducible variety embedded in an affine
space $k^m$ and defined by a prime ideal $P$ of $k[X_1, \dots, X_m]$ of dimension $r \ge 2$.
For $m$ such that $m \ge r \ge n \ge 2$, we project $X$ surjectively onto the affine
space $k^n \times 0$ included in $k^m$, by $f$ which is here the canonical projection. $X$
being geometrically irreducible means $P$ absolutely prime. $H$ is defined by
linear equations $L$ which involve only the first $n$ coordinates, and $f^{-1}(H)$
geometrically irreducible means $P + L$ is absolutely prime.

Then we see that this statement improves Lemma 9.1.6 because if we 
choose n = 2 and get a surjective projection, then the generic linear equations 
of H depend only on the first 2 coordinates. 


9.1.4 Irreducibility testing 


In the factorization process an important preliminary task is to test if a poly- 
nomial is already irreducible. This is the case for rational factorization as well 
as for absolute factorization.

For example the polynomial $P(X) = 5X^4 + 17X^3 - 15X^2 + 12X + 19 \in \mathbb{Z}[X]$ is
irreducible because modulo 3 we get
$\overline{P}(X) = 2X^4 + 2X^3 + 1 = -(X^4 + X^3 - 1) \in \frac{\mathbb{Z}}{3\mathbb{Z}}[X]$, which is irreducible.
Indeed, if we had $P = P_1 \cdot P_2$ in $\mathbb{Z}[X]$ with $\deg(P_1) > 0$ and $\deg(P_2) > 0$, then
$\mathrm{lc}(P_1) \cdot \mathrm{lc}(P_2) = 5 = -1 \bmod 3$, and $\overline{P}_1$ and $\overline{P}_2$ would satisfy
$\overline{P} = \overline{P}_1 \cdot \overline{P}_2$ and $\deg(\overline{P}_1) > 0$, $\deg(\overline{P}_2) > 0$. This reduction
argument is very general and allows to reduce absolute irreducibility testing
to the bivariate case using Bertini’s theorem.
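
For a polynomial of such small degree, the irreducibility of the reduction
modulo 3 can be confirmed by brute force. The sketch below is an assumption of
ours, not an algorithm from the text: it simply tries every monic polynomial of
degree at most half the degree as a divisor, which is only reasonable for tiny
$p$ and tiny degrees. Coefficient lists are written lowest degree first.

```python
from itertools import product

def poly_mod(a, b, p):
    """Remainder of a divided by the monic polynomial b over Z/pZ."""
    a = [c % p for c in a]
    while len(a) >= len(b):
        lead = a[-1]
        if lead:
            shift = len(a) - len(b)
            for i, c in enumerate(b):
                a[shift + i] = (a[shift + i] - lead * c) % p
        a.pop()                      # the leading coefficient is now zero
    return a

def is_irreducible_mod_p(f, p):
    """Brute-force irreducibility test for f (degree >= 1) over Z/pZ."""
    f = [c % p for c in f]
    while f and f[-1] == 0:          # drop leading zeros after reduction
        f.pop()
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=d):
            g = list(tail) + [1]     # monic candidate divisor of degree d
            if not any(poly_mod(f, g, p)):
                return False
    return True

P = [19, 12, -15, 17, 5]             # 5X^4 + 17X^3 - 15X^2 + 12X + 19
irreducible_mod_3 = is_irreducible_mod_p(P, 3)
irreducible_mod_2 = is_irreducible_mod_p(P, 2)
```

Modulo 3 the test succeeds, which is the case used in the argument above.
Modulo 2 the reduction has the root 1 and is reducible, so that prime gives
no information: a reducible image never proves anything about $P$.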

One can use a similar argument to derive the following simple absolute
irreducibility test which was successfully studied by J.F. Ragot in his Ph.D.
thesis [Rag97]. It is efficient, easy to implement and works in most cases. Let
$P \in \mathbb{Z}[X,Y]$ be irreducible. The idea is to find a simple point of the curve
$P(x,y) = 0$ over $\mathbb{Z}/p\mathbb{Z}$ for some prime $p$ (see Definition 9.0.11); it proceeds by
an extensive sieve.

We denote by (*) the following condition:

(*)  $f$ is irreducible in $\frac{\mathbb{Z}}{p\mathbb{Z}}[X,Y]$;
     there exists a simple point $(a,b) \in \big(\frac{\mathbb{Z}}{p\mathbb{Z}}\big)^2$ of $f \bmod p$;
     the degree of $f \bmod p$ is equal to the degree of $f$.


Algorithm 9.1.10 RAGOT’S ALGORITHM


Input: $f(X,Y) \in \mathbb{Z}[X,Y]$
For each prime p from 2 to (say) 101 do
    if f mod p satisfies (*) then return (“f is absolutely irreducible”) end if.
end for.
return (“I don’t know”)
Output: “f is absolutely irreducible” or “I don’t know”.
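
One ingredient of condition (*), the search for a simple point modulo $p$, can
be sketched by exhaustive enumeration. The code below is only an illustration
of that single ingredient under our own conventions (a polynomial is a
dictionary mapping exponent pairs `(i, j)` of $X^iY^j$ to integer
coefficients); it does not test irreducibility in $\frac{\mathbb{Z}}{p\mathbb{Z}}[X,Y]$ and is
therefore not a full implementation of Ragot’s algorithm.

```python
def evaluate(poly, a, b, p):
    """Value of the polynomial at (a, b) modulo p."""
    return sum(c * pow(a, i, p) * pow(b, j, p) for (i, j), c in poly.items()) % p

def partial(poly, var):
    """Formal partial derivative with respect to X (var=0) or Y (var=1)."""
    out = {}
    for (i, j), c in poly.items():
        e = (i, j)[var]
        if e:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            out[key] = out.get(key, 0) + e * c
    return out

def simple_points(poly, p):
    """All simple points of the curve poly = 0 over Z/pZ (Definition 9.0.11)."""
    fx, fy = partial(poly, 0), partial(poly, 1)
    return [(a, b) for a in range(p) for b in range(p)
            if evaluate(poly, a, b, p) == 0
            and (evaluate(fx, a, b, p) or evaluate(fy, a, b, p))]

circle = {(2, 0): 1, (0, 2): 1, (0, 0): -1}      # X^2 + Y^2 - 1
double_line = {(2, 0): 1}                        # X^2

circle_points = simple_points(circle, 3)
double_line_points = simple_points(double_line, 5)
```

The circle has four points modulo 3, all of them simple, whereas every point
of the double line $X^2 = 0$ is singular, so no prime can yield a simple point
for it.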


Often, the mathematical idea behind an irreducibility test can be extended 
to get a factorization algorithm. This was the case for the irreducibility test 
of Ruppert [Rup99], whose idea was later reused by Gao [Gao03] as we will 
see in Lecture 2. This was also the case for the absolute irreducibility test of 
Galligo and Watt [GW97], which was developed into a factorization algorithm 
by Galligo and Rupprecht and was later improved by Cheze as we will see in 
Lectures 3 and 4. 

Before concentrating in the next subsection on generalizations of Eisen- 
stein’s criterion, let us mention another active direction of investigation. 
It deals with multivariate polynomials with complex coefficients, which are 
known only with a given precision. See e.g. the works of [Nag02] and [KM03]. 

Eisenstein’s classical theorem (see e.g. [Eis95]) states that: 


Theorem 9.1.11 (Eisenstein’s criterion). Let $R$ be a unique factorization
domain and $f = f_0 + f_1 X + \dots + f_n X^n \in R[X]$. If there is a prime $p \in R$
such that all the coefficients except $f_n$ of $f$ are divisible by $p$, but $f_0$ is not
divisible by $p^2$, then $f$ is irreducible in $R[X]$.


Example 9.1.12. Let R = C[Y], p= Y and f(X,Y) = X°+2YX*+(Y?4 
3Y)X3 + (Y3 + 6Y)X?4 (Y44+ 10Y?+Y)X+Y +1, then f is absolutely 
irreducible by Eisenstein’s criterion. 


Several mathematicians (Dumas [Dum06], Kürschák [Kur23], Ore [Ore23],
[Ore24b], [Ore24a], Rella [Rel27]) have generalized this criterion by using
Newton polygons.

Construct a polygon in the Euclidean plane as follows. Suppose that the
coefficient $f_i$ is divisible by $p^{a_i}$ but not by any higher power, where $a_i \ge 0$
and $a_i$ is undefined if $f_i = 0$. Plot the points $(0,a_0), (1,a_1), \ldots, (n,a_n)$ in the
Euclidean plane and form the lower convex hull of these points. This results in
a sequence of line segments starting at the $y$-axis and ending at the $x$-axis,
called the Newton polygon of $f$ (with respect to the prime $p$). Dumas [Dum06]
determines the degrees of all the possible nontrivial factors of $f$ in terms of
the widths of the line segments on the Newton polygon of $f$. Consequently a
simple criterion for the irreducibility of $f$ is established.


Theorem 9.1.13 (Eisenstein-Dumas criterion). Let $R$ be a unique factorization
domain and $f = f_0 + f_1 X + \cdots + f_n X^n \in R[X]$ with $f_0 f_n \ne 0$.
Assume that $f$ is primitive, i.e. $\gcd(f_0,\ldots,f_n) = 1$. If the Newton polygon
of $f$, with respect to some prime $p \in R$, consists only of a line segment from
$(0,m)$ to $(n,0)$, and if $\gcd(n,m) = 1$, then $f$ is irreducible in $R[X]$.


The condition on the Newton polygon means that $a_i \ge (n-i)m/n$ for
$0 \le i \le n$, where $p^{a_i}$ exactly divides $f_i$. When $m = a_0 = 1$, this condition is
the same as in Eisenstein’s criterion.
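
The construction of the Newton polygon and the test of Theorem 9.1.13 can be sketched for $R = \mathbb{Z}$ as follows. This is an illustration under our own assumptions: the helper names `valuation`, `lower_hull` and `dumas_applies` are hypothetical, the input is assumed to be primitive with $f_0 f_n \ne 0$ (this is not checked), and a `False` answer only means that the criterion does not apply.

```python
from math import gcd


def valuation(c, p):
    """Largest a such that p**a divides the nonzero integer c."""
    a = 0
    while c % p == 0:
        c //= p
        a += 1
    return a


def lower_hull(points):
    """Vertices of the lower convex hull of points sorted by abscissa."""
    hull = []
    for x, y in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop the middle point if it lies on or above the new segment
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull


def dumas_applies(coeffs, p):
    """True if the Newton polygon is one segment (0,m)-(n,0) with gcd(n,m)=1."""
    n = len(coeffs) - 1
    pts = [(i, valuation(c, p)) for i, c in enumerate(coeffs) if c != 0]
    hull = lower_hull(pts)
    m = hull[0][1]
    return hull == [(0, m), (n, 0)] and gcd(n, m) == 1


print(dumas_applies([4, 0, 0, 1], 2))  # X^3 + 4, segment (0,2)-(3,0)
print(dumas_applies([4, 0, 1], 2))     # X^2 + 4, gcd(2,2) = 2
```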

A somewhat related criterion is used for local irreducibility of Weierstrass
polynomials, for instance in the proof of the Newton-Puiseux theorem. A series
in Weierstrass form $f = Y^n + \sum_{i=0}^{n-1} a_i(X) Y^i \in \mathbb{C}[[X]][Y]$, with
$\mathrm{valuation}(a_i(X)) = v_i$ and $v_0 = m$, is irreducible in $\mathbb{C}[[X]][Y]$ when $\gcd(n,m) = 1$ and
$v_i > (n-i)m/n$ for $i = 1$ to $n-1$. In that case, the upper Newton polygon of $f$
with respect to $Y$ has only one segment $[(0,n);(m,0)]$. Of course, this also
holds if we replace the series by polynomials vanishing at 0.

These two criteria have been generalized by Gao. After Ostrowski (1921
and 1970), he considered not only the lower or upper Newton polygon but the
complete convex hull of the support $\{(i,j) \mid a_{ij} \ne 0\}$ of $P = \sum_{i,j} a_{ij} X^i Y^j$.
As an application, Gao obtained for instance the following nice
special criterion for absolute irreducibility of bivariate polynomials.


Theorem 9.1.14. Let $F$ be any field and $f = aX^n + bY^m + cX^uY^v +
\sum c_{ij} X^i Y^j \in F[X,Y]$ with $a, b, c$ nonzero. Suppose that the Newton polytope
of $f$ is the triangle with vertices $(n,0)$, $(0,m)$ and $(u,v)$. If $\gcd(m,n,u,v) = 1$
then $f$ is absolutely irreducible over $F$.




Example 9.1.15. $P(X,Y) = X^3 + Y^2 + X^5Y^4 + X^4Y^2 + X^3Y^2 + X^2Y + XY^2$
and $Q(X,Y) = 2X^3 + 7Y^2 + X^5Y^4 + 10X^4Y^2 + X^3Y^2 + X^2Y + 3XY^2$ are
absolutely irreducible. Here $n = 3$, $m = 2$, $u = 5$, $v = 4$. The Newton polytope
of these two polynomials is shown in Figure 9.1:
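
The hypotheses of Theorem 9.1.14 only involve the support of $f$, so they are easy to test. The sketch below is a hedged illustration: the function names and the small supports used as input are our own assumptions, chosen so that the triangle has the vertices $(3,0)$, $(0,2)$, $(5,4)$ of the example; only exponents are examined, the nonzero coefficients play no role.

```python
from math import gcd


def cross(o, p, q):
    """Twice the signed area of the triangle (o, p, q)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def in_triangle(pt, a, b, c):
    """True if pt lies in the closed triangle abc (exact integer test)."""
    signs = [cross(a, b, pt), cross(b, c, pt), cross(c, a, pt)]
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


def gao_triangle_test(support, n, m, u, v):
    """Check the hypotheses of the triangle criterion on a set of exponents."""
    verts = [(n, 0), (0, m), (u, v)]
    if cross(*verts) == 0:                       # degenerate triangle
        return False
    if not all(vtx in support for vtx in verts):  # a, b, c must be nonzero
        return False
    if not all(in_triangle(e, *verts) for e in support):
        return False
    return gcd(gcd(m, n), gcd(u, v)) == 1


# illustrative support: the three vertices plus two interior exponents
support = {(3, 0), (0, 2), (5, 4), (3, 2), (2, 1)}
print(gao_triangle_test(support, 3, 2, 5, 4))
```

A `False` answer does not prove reducibility; it only says that this particular criterion cannot be applied.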


Fig. 9.1. The Newton polytope of an absolute irreducible polynomial. 


Gao and his coworkers wrote several papers on these topics (e.g. [Gao01]).
They contain an extensive bibliography on the subject. 


9.2 Lecture 2: Factorization algorithms via computations 
in algebraic number fields 


The first algorithms by Trager-Traverso, which we explained in the introduc- 
tion, and the early ones by Kaltofen were used for getting complexity bounds. 
But they were hardly efficient for degrees greater than 15 because they re- 
quired the solution of huge linear systems over large algebraic number fields. 
Then in the late 80’s, D. Duval presented in her PhD thesis an algorithm 
relying on classical algebraic geometry of complex curves and algebraic func- 
tion fields. This algorithm was able to compute, in a first step, the number of 
absolute irreducible factors of a polynomial, and a minimal extension which 
contains the coefficients of one factor. 


9.2.1 Duval’s algorithm (1987) 


Let $P(X,Y) \in \mathbb{Q}[X,Y]$ be irreducible, and let $k$ be the algebraic closure of $\mathbb{Q}$ in

$$\mathbb{Q}(x,y) = \mathrm{Frac}\big(\mathbb{Q}[X,Y]/(P)\big) = K.$$

Let $\mathcal{C}$ be the curve in $\mathbb{C}^2$ defined by $P(X,Y) = 0$;
$K = \mathbb{Q}(\mathcal{C})$ is the field of $\mathbb{Q}$-rational functions on $\mathcal{C}$ and $k$ is called its subfield
of constants.


Theorem 9.2.1. The number of absolute irreducible factors of $P$ is $[k : \mathbb{Q}]$.
Moreover, $P$ factorizes as follows: $P = \prod_i \sigma_i(G)$ where $G$ is the minimal
polynomial of $y$ over $k(X)$, and the $\sigma_i$ are the $\mathbb{Q}$-isomorphisms of $k$ in $\overline{\mathbb{Q}}$.


Proposition 9.2.2. $k$ is a subfield of $\mathcal{O}_K$, the ring of integers of $K$ over
$\mathbb{Q}[X]$.


These results give rise to an algorithm. 


Algorithm 9.2.3 DUVAL’S ALGORITHM
Input: $P(X,Y) \in \mathbb{Q}[X,Y]$ irreducible in $\mathbb{Q}[X,Y]$.


1. Compute a basis of $\mathcal{O}_K$ over $\mathbb{Q}[X]$.
2. Compute a basis of $k$ over $\mathbb{Q}$.
3. Compute an absolute factorization of $P$.


Output: An absolute factor of $P(X,Y)$.


The first step relies on the work of Ford-Zassenhaus or Dedekind-Weber.
The second step relies on the study of parameterizations of $\mathcal{C}$ by $\overline{\mathbb{Q}}[[t]]$ and
valuations, in order to get special bases for $\mathcal{O}_K$ on which one reads the result.
This study is related to normalization. The third step results from a gcd
computation in $\mathbb{Q}(\alpha)(X)[Y]$.

This algorithm was implemented by J.F. Ragot in 1994, in Maple. It was 
more efficient than the former ones. But Gao’s algorithm, which we now 
present, is simpler and more efficient. 


9.2.2 Gao’s algorithm for absolute factorization 


Gao’s algorithm is based on a geometric idea inspired by the proof of an irre- 
ducibility theorem of W. Ruppert [Rup99] and by the work of H. Niederreiter 
[Nie93] on factorization. In these notes we specialize Gao’s algorithm to the 
following input and output. 

Input: $P(X,Y) \in \mathbb{Q}[X,Y]$, irreducible in $\mathbb{Q}[X,Y]$.

Output:


1. The number $d$ of absolute irreducible factors.
2. A minimal polynomial $q$ of $\alpha$, $\deg(q) = d$.
3. An absolute irreducible factor $P_1(X,Y) \in \mathbb{Q}(\alpha)[X,Y]$ of $P$.


Briefly, the algorithm will produce a $\mathbb{Q}$-vector space $F$ of dimension $d$,
whose elements are some rational solutions of a partial differential equation. A
basis of this $F$ is computed by solving a rather large system of linear equations.
As we will see, a basis of $\overline{F} = F \otimes_{\mathbb{Q}} \overline{\mathbb{Q}}$ consists of $d$ polynomials closely related
to the $d$ factors of $P(X,Y)$.

Remark 9.2.4. A version of Gao’s algorithm can also be used for finding a
rational factorization. Its complexity is polynomial and efficient (“almost”
quadratic for dense inputs). It extends to an effective Bertini irreducibility
theorem.


In order to ease the exposition of Gao’s approach, we point out two key 
observations in his method. 


First observation 


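Let $P(X,Y) = \prod_{i=1}^{s} P_i(X,Y)$. Taking logarithms, this implies $\log(P(X,Y)) =
\sum_{i=1}^{s} \log(P_i(X,Y))$. Set

$$\frac{\partial P}{\partial X} = \sum_{i=1}^{s} \underbrace{\Big(\prod_{j \ne i} P_j\Big) \frac{\partial P_i}{\partial X}}_{g_i}, \qquad \frac{\partial P}{\partial Y} = \sum_{i=1}^{s} \underbrace{\Big(\prod_{j \ne i} P_j\Big) \frac{\partial P_i}{\partial Y}}_{h_i}.$$

Then we have $\frac{\partial}{\partial X}(\log P_i) = \frac{1}{P_i}\frac{\partial P_i}{\partial X} = \frac{g_i}{P}$ and $\frac{\partial}{\partial Y}(\log P_i) = \frac{h_i}{P}$.


The following relation expresses the classical Schwarz equality on the
second derivatives:

$$(*)\quad \frac{\partial}{\partial Y}\Big(\frac{g_i}{P}\Big) = \frac{\partial}{\partial X}\Big(\frac{h_i}{P}\Big) \ \text{ for } i = 1 \text{ to } s, \quad\text{and}\quad \sum_{i=1}^{s} g_i = \frac{\partial P}{\partial X}.$$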


Moreover, we define the bidegree of a polynomial $f \in \mathbb{Q}[X,Y]$ by

$$\mathrm{bideg}(f) = (\deg_X(f), \deg_Y(f)),$$

and we write $\mathrm{bideg}(P) = (m,n)$. If the previous factors $P_i$ are in $\mathbb{C}[X,Y]$, we have with respect to the natural
partial ordering:

$$(**)\quad \mathrm{bideg}(g_i) \le (m-1,n), \qquad \mathrm{bideg}(h_i) \le (m,n-1).$$


Therefore, to a polynomial factorization of $P$ we naturally attach a set of
polynomials $(g_i,h_i)$ which satisfy $(*)$ and $(**)$. In 1986, W. Ruppert derived
a condition for absolute irreducibility from similar data.

Let us note for algorithmic purposes that $(*)$ can be rewritten linearly in
$g_i$ and $h_i$ as

$$(*)\quad P\Big(\frac{\partial g_i}{\partial Y} - \frac{\partial h_i}{\partial X}\Big) + h_i \frac{\partial P}{\partial X} - g_i \frac{\partial P}{\partial Y} = 0.$$


Definition 9.2.5. Let $F$ be the $\mathbb{Q}$-vector space of solutions $(v,w) \in \mathbb{Q}[X,Y]^2$
of the PDE

$$(*)\quad \frac{\partial}{\partial Y}\Big(\frac{v}{P}\Big) = \frac{\partial}{\partial X}\Big(\frac{w}{P}\Big)$$

such that $\mathrm{bideg}(v) \le (m-1,n)$, $\mathrm{bideg}(w) \le (m,n-1)$.
Moreover we set $\overline{F} = F \otimes_{\mathbb{Q}} \overline{\mathbb{Q}}$.


Remark 9.2.6. A basis of $F$ can be computed by solving a linear system with
$m(n+1) + n(m+1)$ unknowns and $4mn$ equations.
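
To make the linear system concrete, the following sketch builds it for a given $P$ and computes the dimension of its solution space by exact Gaussian elimination. It is only an illustration under our own assumptions: polynomials are stored as dictionaries `{(i, j): coefficient}` for the monomial $X^iY^j$, the helper names (`pmul`, `gao_operator`, `dim_F`, ...) are hypothetical, and no attempt is made at efficiency. The linear form used is $P(\partial v/\partial Y - \partial w/\partial X) - v\,\partial P/\partial Y + w\,\partial P/\partial X = 0$.

```python
from fractions import Fraction


def pmul(a, b):
    out = {}
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return out


def padd(a, b, sign=1):
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + sign * c
    return out


def dX(a):
    return {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}


def dY(a):
    return {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}


def gao_operator(P, v, w):
    """P*(dv/dY - dw/dX) - v*dP/dY + w*dP/dX, which must vanish on F."""
    t = pmul(P, padd(dY(v), dX(w), -1))
    t = padd(t, pmul(v, dY(P)), -1)
    return padd(t, pmul(w, dX(P)))


def rank(rows):
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def dim_F(P):
    """Return (number of unknowns, number of equations, dim of F)."""
    m = max(i for i, _ in P)
    n = max(j for _, j in P)
    unknowns = [("v", i, j) for i in range(m) for j in range(n + 1)]
    unknowns += [("w", i, j) for i in range(m + 1) for j in range(n)]
    monos = [(i, j) for i in range(2 * m) for j in range(2 * n)]
    images = []
    for kind, i, j in unknowns:
        v = {(i, j): 1} if kind == "v" else {}
        w = {(i, j): 1} if kind == "w" else {}
        img = gao_operator(P, v, w)
        images.append([img.get(mn, 0) for mn in monos])
    # the rank of the system equals the rank of the list of images
    return len(unknowns), len(monos), len(unknowns) - rank(images)


print(dim_F({(2, 0): 1, (0, 2): -2}))  # P = X^2 - 2Y^2
print(dim_F({(0, 2): 1, (3, 0): -1}))  # P = Y^2 - X^3
```

For $P = X^2 - 2Y^2 = (X - \sqrt{2}\,Y)(X + \sqrt{2}\,Y)$ the computed dimension is 2, and for $P = Y^2 - X^3$ it is 1, in agreement with the statement that the dimension counts the absolute factors.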


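Theorem 9.2.7 (Gao’s theorem). Let $P$ be irreducible in $\mathbb{Q}[X,Y]$, and let
$P = \prod_{i=1}^{s} P_i$ be an absolute factorization of $P$. Let

$$g_i = \Big(\prod_{j \ne i} P_j\Big) \frac{\partial P_i}{\partial X} \quad\text{and}\quad h_i = \Big(\prod_{j \ne i} P_j\Big) \frac{\partial P_i}{\partial Y}.$$

Then $\dim_{\mathbb{Q}} F = \dim_{\overline{\mathbb{Q}}} \overline{F}$ is the number of factors $s$ of $P$.
Moreover $\{(g_1,h_1),\ldots,(g_s,h_s)\}$ is a basis of $\overline{F}$.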


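The proof of this theorem uses partial fraction decompositions over $\overline{\mathbb{Q}(X)}(Y)$
of $\frac{\partial}{\partial Y}\big(\frac{v}{P}\big)$ and of $\frac{\partial}{\partial X}\big(\frac{w}{P}\big)$. It relies on the following lemma, which allows
grouping of terms by conjugacy classes.


Lemma 9.2.8. Let $k = \mathbb{Q}(X)$, let the derivation $\frac{d}{dX}$ extend to $K = \overline{k}$, and
assume we have $\frac{d}{dX}(K) \subset K$. Let $\alpha$ be algebraic over $\mathbb{Q}(X)$; if $\frac{d\alpha}{dX} = 0$ then
$\alpha \in \overline{\mathbb{Q}}$.
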
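Proof. Let $T = T(z,X) = z^l + v_{l-1}(X) z^{l-1} + \cdots + v_0(X)$, $v_j \in \mathbb{Q}(X)$, be the
unique minimal polynomial of $\alpha$.
Since $T(\alpha,X) = 0$, taking the derivative (with respect to $X$) of this composition
of functions gives

$$\frac{\partial T}{\partial z}(\alpha,X)\,\frac{d\alpha}{dX} + \frac{\partial T}{\partial X}(\alpha,X) = 0.$$

As $\frac{d\alpha}{dX} = 0$ we get $\frac{\partial T}{\partial X}(\alpha,X) = 0$, which can be written

$$\frac{dv_{l-1}}{dX}(X)\,\alpha^{l-1} + \cdots + \frac{dv_0}{dX}(X) = 0.$$

If this were not identically zero, it would contradict the fact that $T$ is the
minimal polynomial of $\alpha$. So we get $\frac{dv_j}{dX} = 0$, therefore $v_j \in \mathbb{Q}$ and $\alpha \in \overline{\mathbb{Q}}$.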


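Proof (Gao’s theorem). Let $n$ be the degree of $P$ in $Y$. Consider the factorization
$P = \prod_{i=1}^{s} P_i$ where $P_i \in \overline{\mathbb{Q}}[X,Y]$. We can decompose this factorization
further over $K[Y]$, with $K = \overline{\mathbb{Q}(X)}$, and get:

$$P = u(X) \prod_{j=1}^{n} (Y - \varphi_j(X)), \quad \varphi_j \in K.$$

We set

$$P_i = u_i(X) \prod_{j \in I_i} (Y - \varphi_j(X))$$

where the disjoint union of the $I_i$ gives $\{1,\ldots,n\}$. The $\varphi_j$, $j \in I_i$, are conjugate
over $\overline{\mathbb{Q}}(X)$. Now $\frac{\partial}{\partial X}$ and $\frac{\partial}{\partial Y}$ act on $K$: $\frac{\partial}{\partial X}(K) \subset K$, $\frac{\partial}{\partial Y}(K) = 0$. We have
a unique partial fraction decomposition in $K(Y)$. As $\deg_Y(w) < \deg_Y(P)$ and
$\deg_Y(v) \le \deg_Y(P)$, we have:

$$\frac{w}{P} = \sum_{j=1}^{n} \frac{\lambda_j}{Y - \varphi_j}, \quad\text{where } \lambda_j = \frac{w(X,\varphi_j(X))}{\frac{\partial P}{\partial Y}(X,\varphi_j(X))} \in K,$$

$$\frac{v}{P} = \sum_{j=1}^{n} \frac{\mu_j}{Y - \varphi_j} + \nu, \quad\text{where } \mu_j \in K \text{ and } \nu \in K.$$

By $(*)$ we have

$$\frac{\partial}{\partial X}\Big(\frac{w}{P}\Big) = \sum_{j=1}^{n} \bigg( \frac{\partial \lambda_j/\partial X}{Y - \varphi_j} + \frac{\lambda_j\, \partial \varphi_j/\partial X}{(Y - \varphi_j)^2} \bigg) = \frac{\partial}{\partial Y}\Big(\frac{v}{P}\Big) = \sum_{j=1}^{n} \frac{-\mu_j}{(Y - \varphi_j)^2}.$$

So for all $j$ we have $\frac{\partial \lambda_j}{\partial X} = 0$, and the previous lemma implies $\lambda_j \in \overline{\mathbb{Q}}$.
Moreover, for each $i$, $i = 1$ to $s$, by conjugation, for all $j \in I_i$ the $\lambda_j$ are
conjugate over $\overline{\mathbb{Q}}(X)$. But since $\lambda_j \in \overline{\mathbb{Q}}$ they are equal, so we denote by $\lambda_i$
their common value. Hence

$$\frac{w}{P} = \sum_{i=1}^{s} \lambda_i \sum_{j \in I_i} \frac{1}{Y - \varphi_j} = \sum_{i=1}^{s} \lambda_i\, \frac{\partial P_i/\partial Y}{P_i}$$

and we get $w = \sum_{i=1}^{s} \lambda_i h_i$ as claimed.

Now we get $\frac{\partial}{\partial Y}\Big(\frac{v - \sum_{i=1}^{s} \lambda_i g_i}{P}\Big) = 0$ (because $(v,w)$ and $(g_i,h_i)$ satisfy $(*)$);
this implies $v - \sum_{i=1}^{s} \lambda_i g_i = A(X)\,P(X,Y)$ for some $A(X)$. As $\deg_X(v) \le m-1$ we get $A = 0$ and
then $v = \sum_{i=1}^{s} \lambda_i g_i$.


We deduce two corollaries. The first is an irreducibility criterion and the
second gives rise to Gao’s algorithm.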


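Corollary 9.2.9. $P$ is absolutely irreducible if and only if $\dim_{\mathbb{Q}} F = 1$.

Corollary 9.2.10. If $v = \sum_{i=1}^{s} \lambda_i g_i$, where $\lambda_i \ne \lambda_j$ if $i \ne j$, then

$$P_i = \gcd\Big(P,\ v - \lambda_i \frac{\partial P}{\partial X}\Big).$$

Exercise 9.2.11. Prove Corollary 9.2.10. (Hint: $\sum_{i=1}^{s} g_i = \frac{\partial P}{\partial X}$.)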


Second observation 


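Now we are in the following situation: given $v$ in $\overline{E}$, we want to know its
coordinates $\lambda_1,\ldots,\lambda_s$ in the basis $\{g_1,\ldots,g_s\}$. Call $E$ the first projection of
$F$, $E = \{v \in \mathbb{Q}[X,Y] \mid \mathrm{bideg}(v) \le (m-1,n),\ \exists w \text{ such that } (v,w) \in F\}$. Note that $E$ can be
embedded in $\mathbb{Q}[X,Y]/(P)$. We set $\overline{E} = E \otimes_{\mathbb{Q}} \overline{\mathbb{Q}}$, $E = \mathbb{Q}\langle v_1,\ldots,v_d\rangle$, and
$\overline{E} = \overline{\mathbb{Q}}\langle g_1,\ldots,g_d\rangle$.

We have $g_i = \big(\prod_{j \ne i} P_j\big) \frac{\partial P_i}{\partial X}$ and $\sum_{i} g_i = \frac{\partial P}{\partial X}$, so

$$\forall i \ne j,\quad g_i g_j = 0 \bmod P, \qquad\qquad \forall i,\quad g_i^2 = g_i \frac{\partial P}{\partial X} \bmod P.$$

Let $v = \sum_{i=1}^{s} \lambda_i g_i$, then $v g_j = \sum_i \lambda_i g_i g_j = \lambda_j g_j^2 = \lambda_j g_j \frac{\partial P}{\partial X} \bmod P$. In other
words, the $\lambda_i$ are the eigenvalues of the following linear map:

$$\mathrm{mult}_v : \overline{E} \to \overline{E}\,\frac{\partial P}{\partial X}, \qquad g \mapsto v \cdot g,$$

where $\overline{E}\,\frac{\partial P}{\partial X} = \overline{\mathbb{Q}}\big\langle g_1 \frac{\partial P}{\partial X}, \ldots, g_s \frac{\partial P}{\partial X}\big\rangle$. In other words, the $\lambda_i$ are the roots of
the characteristic polynomial $q_v(t) = P_{\mathrm{char}}(\mathrm{mult}_v)$.

We can compute $q_v(t)$ for $v \in E$ from our knowledge of the basis $v_1,\ldots,v_d$
of $E$. Indeed if $B$ is the matrix of the change of basis from $\{v_1,\ldots,v_d\}$ of $E$
to $\{g_1,\ldots,g_d\}$, then it does the same job for $\overline{E}\,\frac{\partial P}{\partial X}$. So

$$q_v(t) = \det(B) \cdot P_{\mathrm{char}}(\mathrm{mult}_v) \det(B^{-1}).$$

Finally we get the following algorithm: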
Finally we get the following algorithm: 


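Algorithm 9.2.12 GAO’S ALGORITHM
Input: $P(X,Y) \in \mathbb{Q}[X,Y]$, irreducible in $\mathbb{Q}[X,Y]$.


1. Choose a “generic” $v$ in $E$.
2. Compute $q_v(t) = P_{\mathrm{char}}(\mathrm{mult}_v)$, irreducible in $\mathbb{Q}[t]$. If $q_v(t)$ has a multiple
root then go back to step 1.
3. Call $\lambda_1$ a root of $q_v(t)$, then $v - \lambda_1 \frac{\partial P}{\partial X} \in \overline{\mathbb{Q}}\langle g_2,\ldots,g_d\rangle$, therefore this
polynomial is divisible by $P_1$.
4. $P_1 = \gcd\big(P, v - \lambda_1 \frac{\partial P}{\partial X}\big)$ in $\frac{\mathbb{Q}[t]}{(q_v(t))}[X,Y] = \mathbb{Q}(\lambda_1)[X,Y]$.

Output: An absolute factor $P_1(X,Y) \in \mathbb{Q}[\lambda_1][X,Y]$, and the minimal polynomial
$q_v(t)$ of $\lambda_1$ over $\mathbb{Q}$.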
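
The four steps can be followed by hand on a very small case. The computation below is a toy illustration chosen here (it is not the example treated in the text): we take $P = X^2 - 2Y^2$, for which $E = \mathbb{Q}\langle X, Y\rangle$, and the choice $v = Y$ is our own.

```latex
\begin{align*}
P &= X^2 - 2Y^2 = (X - \sqrt{2}\,Y)(X + \sqrt{2}\,Y), \qquad
  g_1 = X + \sqrt{2}\,Y,\quad g_2 = X - \sqrt{2}\,Y,\quad
  g_1 + g_2 = 2X = \frac{\partial P}{\partial X}. \\[4pt]
\text{Step 1: } & v = Y = \lambda_1 g_1 + \lambda_2 g_2
  \quad\text{with}\quad \lambda_1 = \tfrac{1}{2\sqrt{2}},\ \lambda_2 = -\tfrac{1}{2\sqrt{2}}. \\[4pt]
\text{Step 2: } & v \cdot X = XY = \tfrac{1}{2}Y \cdot \frac{\partial P}{\partial X}, \qquad
  v \cdot Y = Y^2 \equiv \tfrac{1}{2}X^2 = \tfrac{1}{4}X \cdot \frac{\partial P}{\partial X} \pmod P, \\
 & \text{so in the basis } (X, Y):\quad
  \mathrm{mult}_v = \begin{pmatrix} 0 & \tfrac14 \\ \tfrac12 & 0 \end{pmatrix},
  \qquad q_v(t) = t^2 - \tfrac18 . \\[4pt]
\text{Step 3: } & \lambda_1 = \tfrac{1}{2\sqrt{2}}, \qquad
  v - \lambda_1 \frac{\partial P}{\partial X} = Y - \tfrac{1}{\sqrt{2}}X
  = -\tfrac{1}{\sqrt{2}}\,(X - \sqrt{2}\,Y). \\[4pt]
\text{Step 4: } & P_1 = \gcd\Big(P,\ v - \lambda_1 \frac{\partial P}{\partial X}\Big) = X - \sqrt{2}\,Y
  \in \mathbb{Q}(\lambda_1)[X,Y] = \mathbb{Q}(\sqrt{2})[X,Y].
\end{align*}
```

Here $q_v$ is irreducible of degree 2 with distinct roots, which matches the two absolute factors of $P$.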


Implementation, examples, and exercises 


Among the implementations of Gao’s algorithm there is one by J. May in
Maple. It is well commented and available on the web¹. We downloaded and
tested it on the example described hereafter.

¹ http://www4.ncsu.edu:8030/jpmay/ECCADO1/

We consider a polynomial $R(X,Y) \in \mathbb{Q}[X,Y]$ with bidegree $(12,12)$. The
PDE is written as a linear system: the number of coefficients of $(v,w)$ is
$2 \times (12 \times 13) = 312$, and the number of equations is $24 \times 24 = 576$ but many
are identically 0. This gives rise to a vector space $F$ of dimension 4 over $\mathbb{Q}$,
$E = \langle v_1, v_2, v_3, v_4 \rangle$, where each $v_i$ is a polynomial of degree 11.

We choose a generic linear combination $v = 8v_1 - v_2 - v_3 + v_4$, then write
the $4 \times 4$ matrix $\mathrm{mult}_v$: we first compute $v_1 \frac{\partial P}{\partial X}$, $v_2 \frac{\partial P}{\partial X}$, $v_3 \frac{\partial P}{\partial X}$, $v_4 \frac{\partial P}{\partial X}$ reduced by
$P$, to get 4 polynomials of degree 11.
Then we compute the reduction of $v_1 \cdot v$, $v_2 \cdot v$, $v_3 \cdot v$, $v_4 \cdot v$ by $P$ to get
4 polynomials of degree 11. After that, we solve 4 systems with 4 unknowns
and 276 linear equations. We first obtain the irreducible polynomial
lis ; 69791 12 73148 | _ 253583 
3° 2313 2313-20817” 

And finally we get an absolute factor. The total time (on a small PC) was 95 
seconds. 

J. May and E. Kaltofen have also recently developed a version of this
algorithm adapted to imprecise input data, i.e. polynomials with floating point
coefficients. See [KM03].

Theorem 9.2.7 and hence Gao’s algorithm work for a field $F$ of characteristic
zero and for fields of characteristic $p > (2m-1)n$. Now we present an
exercise which follows the idea of Ruppert (see [Rup99] and [Gao03]). For an
explicit bound $M$ in the following exercise you can see the proof in [Rup99].


Exercise 9.2.13 (Absolute irreducibility modulo p).
a) Show that $X^4 + 1$ is an irreducible polynomial of $\mathbb{Z}[X]$.
b) Show that $X^4 + 1 \bmod p$ is reducible for every prime $p$.
c) Show that these kinds of phenomena cannot appear when we study an
absolutely irreducible polynomial $P \in \mathbb{Z}[X,Y]$. That is to say: show that if
$P(X,Y) \in \mathbb{Z}[X,Y]$ is absolutely irreducible, then there exists an integer $M$
such that for every prime number $p > M$, $P(X,Y) \bmod p \in \mathbb{F}_p[X,Y]$ is
absolutely irreducible. (Hints: Apply Theorem 9.2.7 to $P$, study the rank of
the linear system (see Definition 9.2.5) related to $P$ and to $P \bmod p$, and
apply Theorem 9.2.7 to $P \bmod p$.)
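
Part b) can be explored experimentally before proving it. The brute-force search below is a sketch under our own assumptions (the function name `quadratic_factorization` is hypothetical); it looks for two monic quadratics whose product is $X^4 + 1$ modulo $p$. It checks finitely many primes only and is of course not a proof.

```python
def quadratic_factorization(p):
    """Search a, b, c, d with (X^2 + aX + b)(X^2 + cX + d) = X^4 + 1 mod p."""
    for a in range(p):
        c = (-a) % p                                  # X^3 coefficient: a + c = 0
        for b in range(p):
            for d in range(p):
                if ((b + d + a * c) % p == 0          # X^2 coefficient
                        and (a * d + b * c) % p == 0  # X coefficient
                        and (b * d) % p == 1):        # constant term
                    return (a, b), (c, d)
    return None


for p in (2, 3, 5, 7, 11, 13, 17):
    print(p, quadratic_factorization(p))
```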


Finally, we give an exercise on the proof of Bertini’s theorem (Theo- 
rem 9.1.8). 


Exercise 9.2.14 (Bertini’s theorem).

a) Show that we can assume that $f$ is square free.

b) Suppose that $f$ has $r$ absolutely irreducible factors. Show that $f_p =
f(a_1X + b_1Y + c_1, \ldots, a_nX + b_nY + c_n)$ has $r$ absolutely irreducible factors
over $L = F(a_1, b_1, c_1, \ldots, a_n, b_n, c_n)$. (Hint: Use Corollary 9.1.7.)

c) Consider the linear system for $f_p$ over $L$ (see Definition 9.2.5) and let
$M$ be the associated matrix. Let $N$ be the number of unknowns of the system.
Show that rank $M \le N - r$, and that $N \le d(d+1)$. (Hint: You can replace
$\mathbb{Q}$ by $L$ in Theorem 9.2.7.)

d) Show that there is an $(N-r) \times (N-r)$ submatrix $M_1$ of $M$ whose determinant
is nonzero, and that all the $(N-r+1) \times (N-r+1)$ submatrices of
$M$ have determinant zero.

e) Apply to $\det(M_1)$ the following lemma (see [Zip93, p.192]) and conclude.


Lemma 9.2.15. Let $P \in A[X_1,\ldots,X_n]$ be a polynomial of total degree $D$
over a domain $A$. Let $S$ be a subset of $A$ of cardinality $B$. Then

$$\mathcal{P}\big(P(x_1,\ldots,x_n) = 0 \mid x_i \in S\big) \le \frac{D}{B}.$$
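
The bound of the lemma can be observed on a small case by exhaustive counting. The sketch below is an illustration only; the polynomial $(x-y)(x-1)$, the set $S = \{0,\ldots,9\}$ and the function name `zero_fraction` are our own choices.

```python
from itertools import product


def zero_fraction(poly, nvars, S):
    """Fraction of points of S^nvars at which poly vanishes (exhaustive)."""
    S = list(S)
    zeros = sum(1 for pt in product(S, repeat=nvars) if poly(*pt) == 0)
    return zeros / len(S) ** nvars


def P(x, y):
    return (x - y) * (x - 1)   # total degree D = 2


frac = zero_fraction(P, 2, range(10))   # B = 10
print(frac, "<=", 2 / 10)
```

Here the zeros are the 10 points with $x = y$ and the 10 points with $x = 1$, which share the point $(1,1)$, so 19 of the 100 points are zeros.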


9.2.3 An algorithm using an ODE


In the same family of algorithms for computing an absolute factorization,
we also mention the work of Cormier et al. [CSTU02]. Their algorithm uses
for each polynomial $P$ an adapted ODE which is a suitable generalization
of a minimal polynomial but with respect to differential Galois theory. It
has been implemented, but it does not seem more efficient than Gao’s algorithm.
Nevertheless it has its own theoretical interest.


9.3 Lecture 3: Factorization algorithms via computations 
in the complex plane 


This section and the next one describe the main contributions of the two 
authors to absolute factorization. It is divided into three parts. 

The first one contains results on topology and algebraic geometry of plane 
curves. The second describes their absolute factorization algorithm. The third 
part briefly presents other contributions related to the use of floating point 
and/or monodromy methods. 


9.3.1 Topology and algebraic geometry of plane curves 
Basic definitions and classical results 
Here we recall some classical results; for the proofs we refer e.g. to [Rot88].


Definition 9.3.1. Let $X$ be a topological space, and $x_0$ a point of $X$. Let
$\Gamma(X,x_0) = \{\gamma \in C^0([0,1], X) \mid \gamma(0) = \gamma(1) = x_0\}$ be the loop space on $X$.
Homotopy between loops, denoted by the symbol $\sim$, is an equivalence relation
on $\Gamma(X,x_0)$. We denote by $[\gamma]$ the homotopy class of $\gamma$, and by $\pi_1(X,x_0)$ the
set $\Gamma(X,x_0)/\sim$.


It can be shown that $\pi_1(X,x_0)$ equipped with the concatenation is a group,
in general non commutative.


Example 9.3.2. Take for $X$ the complement in the real plane of a set of $N$
points; $X = \mathbb{R}^2 - \{p_1,\ldots,p_N\}$. Then $\pi_1(X,x_0)$ is a free group generated
by $N$ small loops around the points $p_i$. See Figure 9.2 where $\mathbb{C} = \mathbb{R}^2$ and
$\Delta = \{p_1,\ldots,p_N\}$.


Definition 9.3.3. $\Pi : Y \to X$ is a covering if $\Pi$ is continuous and if all
$x \in X$ have an open neighborhood $U_x$ such that $\Pi^{-1}(U_x)$ is a disjoint union
of open sets $V_i$ in $Y$, with $\Pi_{|V_i} : V_i \to U_x$ a homeomorphism for every $i$.

We call $\Pi_{|V_i}^{-1} : U_x \to V_i$ a section of $\Pi$.

Proposition 9.3.4. Let $X$ be a connected topological space, and let $\Pi : Y \to
X$ be a covering. If there exists $x_0 \in X$ such that the cardinality of $\Pi^{-1}(x_0)$
satisfies $|\Pi^{-1}(x_0)| = N$ then for all $x$ in $X$, we have $|\Pi^{-1}(x)| = N$. In this
situation, $\Pi$ (or $Y$) is called an $N$-fold covering.

Exercise 9.3.5. Let $P(X,Y) = Y^n + a_{n-1}(X)Y^{n-1} + \cdots + a_0(X)$ be a polynomial
in $\mathbb{C}[X,Y]$, $\mathcal{C} = \{(x,y) \in \mathbb{C}^2 \mid P(x,y) = 0\}$, $pr_1 : \mathbb{C}^2 \to \mathbb{C}$ be the projection
on the first coordinate and $\Delta = \{x \in \mathbb{C} \mid \mathrm{Disc}_Y(P(X,Y))(x) = 0\}$. Show
that $pr_{1|\mathcal{C} - pr_1^{-1}(\Delta)} : \mathcal{C} - pr_1^{-1}(\Delta) \to \mathbb{C} - \Delta$ is an $n$-fold covering. (Hint: use
the implicit function theorem.)




Fig. 9.2. A ramified covering with a smooth generic fiber. 


Theorem 9.3.6 (Lifting lemma). Let $X$ be a connected topological space,
$\Pi : Y \to X$ be a covering and $\gamma : [0,1] \to X$ a path such that $\gamma(0) = \gamma(1) =
x_0$.

If $y_0$ is in the fiber over $x_0$ (i.e. $\Pi(y_0) = x_0$) then there exists a unique
path $\tilde\gamma_{y_0} : [0,1] \to Y$ such that $\tilde\gamma_{y_0}(0) = y_0$ and $\Pi \circ \tilde\gamma_{y_0} = \gamma$.


With this lifting, we can define a group action on the fiber. 


Proposition 9.3.7. Let $X$ be a connected topological space, let $\Pi : Y \to X$
be an $N$-fold covering, and $x_0$ a point of $X$. We denote by $F$ the fiber over $x_0$
(i.e. $F = \Pi^{-1}(x_0)$). We have a group action:

$$\pi_1(X,x_0) \times F \to F, \qquad ([\gamma], y_0) \mapsto \tilde\gamma_{y_0}(1)$$

and a group homomorphism:

$$\pi_1(X,x_0) \to \mathrm{Aut}(F) = \mathfrak{S}_N$$

where $\mathfrak{S}_N$ is the symmetric group.

Now we give two useful lemmas about analytic paths.


Lemma 9.3.8. Let $\gamma$ be a closed path in $\mathbb{C} - \{p_1,\ldots,p_d\}$. Then $\gamma$ is homotopic
to an analytic closed path $\delta$.


Fig. 9.3. $\gamma$ is homotopic to the analytic path $\delta$.


Proof. We have a continuous map $\gamma : [0,1] \to \mathbb{C} - \{p_1,\ldots,p_d\}$ such that
$\gamma(0) = \gamma(1)$. We set $\mathcal{E} = \{f \in C^0([0,1],\mathbb{C}) \mid f(0) = f(1)\}$. Furthermore $S =
\mathrm{Span}(\{e^{2i\pi n\theta} \mid n \in \mathbb{Z}\})$ is a dense subset of $\mathcal{E}$ for the $\|\cdot\|_\infty$ norm ($\|f\|_\infty =
\sup_{x \in [0,1]} |f(x)|$).

Now the distance between $\gamma([0,1])$ and $\{p_1,\ldots,p_d\}$ is strictly bigger than
0, because these two compact sets are such that $\gamma([0,1]) \cap \{p_1,\ldots,p_d\} = \emptyset$.
So we set $d(\gamma([0,1]), \{p_1,\ldots,p_d\})/4 = \epsilon$ and we have $\epsilon > 0$.

Because of the density of $S$ there exists a sequence $(f_n)_n \in S$ with the
following property: there exists a number $N$ such that for all $n \ge N$ we have
$\|\gamma - f_n\|_\infty < \epsilon$. We set $\delta(t) = f_N(t) - f_N(0) + \gamma(0)$. Then,

$$\|\delta - \gamma\|_\infty \le \|f_N - \gamma\|_\infty + |f_N(0) - \gamma(0)| < d(\gamma([0,1]), \{p_1,\ldots,p_d\}).$$

So $\delta$ is homotopic to $\gamma$, $\delta$ is analytic and $\delta(0) = \delta(1) = \gamma(0) = \gamma(1)$.


Lemma 9.3.9. The lifting $\tilde\gamma$ of an analytic path $\gamma$ in Theorem 9.3.6 is analytic.


Exercise 9.3.10. Prove Lemma 9.3.9. (Hint: Use the implicit function theorem.)


Irreducibility and path connected spaces


Theorem 9.3.11. Let $P(X,Y)$ be a square-free polynomial of $\mathbb{C}[X,Y]$,
$\Delta = \{x \mid \mathrm{Disc}_Y(P(X,Y))(x) = 0\}$, and $\mathcal{C} = \{(x,y) \in \mathbb{C}^2 \mid P(x,y) = 0\}$.
Then:

$P$ is irreducible in $\mathbb{C}[X,Y]$ $\iff$ $\mathcal{C} - pr_1^{-1}(\Delta)$ is path connected.


We need two lemmas to prove this theorem. 


Lemma 9.3.12. Let $\Pi : Y \to X$ be an $n$-fold covering, and $Y_1$ be a connected
component of $Y$. Then $\Pi_{|Y_1} : Y_1 \to X$ is a $d$-fold covering with $d \le n$.


Exercise 9.3.13. Prove Lemma 9.3.12. 


Lemma 9.3.14. Let $P(X,Y) = Y^n + a_1(X)Y^{n-1} + \cdots + a_n(X)$, $x \in \mathbb{C}$,
and $y(x)$ be a root of $P(x,Y)$, then $|y(x)| \le \max\big(1, \sum_{i=1}^{n} |a_i(x)|\big) \le 1 +
\sum_{i=1}^{n} |a_i(x)|$.


Proof. If $|y(x)| \le 1$ then the lemma is true.

If $|y(x)| > 1$ then $P(x,y(x)) = 0$, thus $y(x)^n = -a_1(x)y(x)^{n-1} - \cdots - a_n(x)$.
It follows that $|y(x)|^n \le \sum_{i=1}^{n} |a_i(x)||y(x)|^{n-i} \le \sum_{i=1}^{n} |a_i(x)||y(x)|^{n-1}$. So
$|y(x)| \le \sum_{i=1}^{n} |a_i(x)|$.
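
Since the lemma is a statement about the roots of the one-variable polynomial $P(x,Y)$ for a fixed $x$, it can be checked numerically on a polynomial built from known roots. The sketch below is illustrative; the roots $3$, $-2$, $i$ and the helper names are our own choices.

```python
def poly_from_roots(roots):
    """Coefficients [1, a_1, ..., a_n] of prod (Y - r), highest degree first."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (Y - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def root_bound(coeffs):
    """max(1, sum |a_i|) for a monic polynomial given as [1, a_1, ..., a_n]."""
    return max(1, sum(abs(a) for a in coeffs[1:]))


roots = [3, -2, 1j]
coeffs = poly_from_roots(roots)      # Y^3 - (1+i)Y^2 + (-6+i)Y + 6i
bound = root_bound(coeffs)
print(coeffs, bound)
print(all(abs(r) <= bound for r in roots))
```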


Proof (Theorem 9.3.11). $\Rightarrow$) We suppose that $P$ is irreducible. First we remark
that $\mathcal{C} - pr_1^{-1}(\Delta)$ is locally path-connected because $\mathbb{C} - \Delta$ is locally
path-connected. So it suffices to show that $\mathcal{C} - pr_1^{-1}(\Delta)$ is connected.

Let $\mathcal{C}_1$ be a connected component of $\mathcal{C} - pr_1^{-1}(\Delta)$. Lemma 9.3.12 implies
that $pr_{1|\mathcal{C}_1} : \mathcal{C}_1 \to \mathbb{C} - \Delta$ is a $d$-fold covering with $d \le n$. Thus if we show
that $d = n$ then we have $\mathcal{C}_1 = \mathcal{C} - pr_1^{-1}(\Delta)$ and we are done.

For every $x_0 \in \mathbb{C} - \Delta$, we have $pr_{1|\mathcal{C}_1}^{-1}(x_0) = \{y_1(x_0),\ldots,y_d(x_0)\}$ with
$y_i(x_0) \ne y_j(x_0)$ when $i \ne j$. As $pr_{1|\mathcal{C}_1}$ is a covering we have a neighborhood
$V_{x_0}$ of $x_0$ such that $y_1,\ldots,y_d$ are defined on $V_{x_0}$. Furthermore, we have
$pr_{1|\mathcal{C}_1}^{-1}(x) = \{y_1(x),\ldots,y_d(x)\}$ for every $x \in V_{x_0}$. Hence $y_i$ is analytic on $V_{x_0}$
by the implicit function theorem applied to $P$ at $(x_0, y_i(x_0))$.

We consider the polynomial

$$(Y - y_1(x)) \cdots (Y - y_d(x)) = Y^d + S_1(x)Y^{d-1} + \cdots + S_d(x).$$

The $S_i(x)$ are the elementary symmetric functions in the $y_i(x)$. $S_i(x)$ is defined on
$V_{x_0}$, and we now see that $S_i(x)$ is a polynomial. First, we show that $S_i(x)$ is
defined on $\mathbb{C} - \Delta$, secondly we show that $S_i(x)$ is defined on $\mathbb{C}$, and bounded
by a polynomial.

Let $x_1 \ne x_0$; as before there exist a neighborhood $V_{x_1}$ of $x_1$ and $d$ analytic
functions $\varphi_1,\ldots,\varphi_d$ such that $pr_{1|\mathcal{C}_1}^{-1}(x) = \{\varphi_1(x),\ldots,\varphi_d(x)\}$ for every $x$ in
$V_{x_1}$. If $V_{x_0} \cap V_{x_1} \ne \emptyset$ then, as $pr_{1|\mathcal{C}_1}$ is a covering and $y_i$ and $\varphi_i$ are sections
of $pr_{1|\mathcal{C}_1}$, we have an element $\sigma \in \mathfrak{S}_d$ such that $y_i = \varphi_{\sigma(i)}$ on $V_{x_0} \cap V_{x_1}$.
Therefore, we have, for example

$$S_1(x) = \sum_{i=1}^{d} y_i(x) = \sum_{i=1}^{d} \varphi_{\sigma(i)}(x) = \sum_{i=1}^{d} \varphi_i(x)$$

on $V_{x_0} \cap V_{x_1}$, thus $S_1(x)$ is defined and analytic in $V_{x_0} \cup V_{x_1}$. So if we repeat
this procedure we get $d$ analytic functions $S_i(x)$ defined on $\mathbb{C} - \Delta$.

Now we have to prove that $S_i(x)$ is defined on $\mathbb{C}$. Let us do it for $S_1(x)$.
Let $x_0$ be in $\Delta$, and let $(\epsilon_n)_n$ be a sequence of elements in $\mathbb{C}$ such that
$\lim_{n \to \infty} \epsilon_n = 0$. Then

$$\lim_{n \to \infty} S_1(x_0 + \epsilon_n) = \lim_{n \to \infty} \sum_{i=1}^{d} y_i(x_0 + \epsilon_n)$$

with $y_i(x)$ well defined and continuous in $x_0$ (as they are the roots of the
polynomial $P(x,Y)$). We get

$$\lim_{n \to \infty} S_1(x_0 + \epsilon_n) = \sum_{i=1}^{d} y_i(x_0)$$

(here there exist $i_0$ and $j_0$ such that $y_{i_0}(x_0) = y_{j_0}(x_0)$). Thus there is no singularity
on $\Delta$, and we can extend analytically $S_1$ to $\mathbb{C}$. We proceed similarly
for all $S_i$.

Let $x$ be in $\mathbb{C}$, and $y_i(x)$ be a root of $P(x,Y)$. Lemma 9.3.14 implies
$|y_i(x)| \le 1 + \sum_{i=1}^{n} |a_i(x)|$; this means that $|S_i(x)|$ is bounded by a polynomial,
then Liouville’s theorem implies that $S_i(x)$ is a polynomial.

Therefore, $P_1(X,Y) = Y^d + S_1(X)Y^{d-1} + \cdots + S_d(X)$ belongs to $\mathbb{C}[X,Y]$.
Now we perform the Euclidean division of $P$ by $P_1$ in $\mathbb{C}(X)[Y]$. As $P_1$ is monic
we get $P(X,Y) = A(X,Y)P_1(X,Y) + R(X,Y)$ with $A(X,Y), R(X,Y) \in
\mathbb{C}[X,Y]$ and $R(X,Y) = r_{d-1}(X)Y^{d-1} + \cdots + r_0(X)$. For every $x \notin \Delta$, we set
$\{y_1(x),\ldots,y_d(x)\} = \{y \mid P_1(x,y) = 0\}$, where $y_i(x) \ne y_j(x)$ if $i \ne j$. As
$(x, y_i(x)) \in \mathcal{C}_1 \subset \mathcal{C} - pr_1^{-1}(\Delta)$ we have $P(x, y_i(x)) = 0$ for $i = 1 \ldots d$. Hence
$R(x, y_i(x)) = 0$ for $i = 1 \ldots d$, and thus $R(x,Y) = 0$ in $\mathbb{C}[Y]$ for every $x \notin \Delta$.
So $r_{d-1}(X) = \cdots = r_0(X) = 0$ in $\mathbb{C}[X]$, and then $P_1$ divides $P$. Now as $P$ is
irreducible, it follows that $P_1 = P$ and then $d = n$.

$\Leftarrow$) We suppose $P = P_1 \cdot P_2$ where $P_i$ is irreducible in $\mathbb{C}[X,Y]$, and $P_1 \ne P_2$
because $P$ is square free (if we have more than two factors the proof is similar).

We set $V(P_i) = \{(x,y) \in \mathbb{C}^2 \mid P_i(x,y) = 0\}$ and $\mathcal{C}_i = V(P_i) \cap (\mathcal{C} - pr_1^{-1}(\Delta))$.
$\mathcal{C}_i$ is a closed subset of $\mathcal{C} - pr_1^{-1}(\Delta)$.

Furthermore, $\mathcal{C}_1$ and $\mathcal{C}_2$ are disjoint, because $pr_1(V(P_1) \cap V(P_2)) \subset
\Delta$. Indeed, we have

$$\mathrm{Disc}_Y(P) = \mathrm{Disc}_Y(P_1) \cdot \mathrm{Disc}_Y(P_2) \cdot \mathrm{Res}_Y(P_1,P_2)^2.$$

So we can conclude that $\mathcal{C} - pr_1^{-1}(\Delta) = \mathcal{C}_1 \sqcup \mathcal{C}_2$, and then that $\mathcal{C} - pr_1^{-1}(\Delta)$
is not connected.


Double point set and transpositions


Lemma 9.3.15 (Change of coordinates). Let $P(X,Y) \in \mathbb{Q}[X,Y]$ be of
total degree $n$. We consider the change of coordinates $f_\lambda(X,Y) = P(X +
\lambda Y, Y)$.

Let $U$ be the subset of $\mathbb{Q}$ such that for every $\lambda$ in $U$ we have
$\deg_Y(f_\lambda(X,Y)) = n$, and there exists $x_0$ in $\mathbb{C}$ such that the polynomial
$f_\lambda(x_0,Y) \in \mathbb{C}[Y]$ has one root $y_0$ of multiplicity two, all the other roots have
multiplicity one, and $\frac{\partial f_\lambda}{\partial X}(x_0,y_0) \ne 0$. Then there exists a finite subset $F$ of
$\mathbb{Q}$ such that $U = \mathbb{Q} - F$.


$(X^2 + Y^2)^2 - 4X^2Y^2 = 0$ $\qquad$ $((X + \tfrac12 Y)^2 + Y^2)^2 - 4(X + \tfrac12 Y)^2Y^2 = 0$


Fig. 9.4. Examples of a bad case $f_0(X,Y) = P(X,Y) = 0$ and of a good case
$f_{1/2}(X,Y) = 0$.


Proof. First we show that $V = \{\lambda \in \mathbb{Q} \mid \deg_Y(f_\lambda(X,Y)) \ne n\}$ is a finite subset
of $\mathbb{Q}$. We set $P(X,Y) = \sum_{k+l \le n} a_{k,l} X^k Y^l$; then, for $a_n(\lambda) \in \mathbb{Q}[\lambda]$ we have

$$f_\lambda(X,Y) = \sum_{k+l \le n} a_{k,l} \sum_{i=0}^{k} \binom{k}{i} X^{k-i} \lambda^i Y^{i+l} = a_n(\lambda)Y^n + a_{n-1}(X,\lambda)Y^{n-1} + \cdots + a_0(X,\lambda).$$

Thus $V = \{\lambda \mid a_n(\lambda) = 0\}$, hence $V$ is finite.

Now we consider $d_1(\lambda,X) = \mathrm{Disc}_Y(f_\lambda(X,Y)) \in \mathbb{Q}[\lambda,X]$. If $(u_0,v_0)$ is a
singular point of $P(X,Y)$ then $(u_0 - \lambda v_0, v_0)$ is a singular point of $f_\lambda(X,Y)$.
So $d_1(\lambda, u_0 - \lambda v_0) = 0$ and $X - (u_0 - \lambda v_0)$ divides $d_1(\lambda,X)$. We denote by
$(u_i,v_i)$, for $i = 1$ to $d$, the singular points of $P$, then

$$d_1(\lambda,X) = \prod_{i=1}^{d} (X - (u_i - \lambda v_i))^{k_i}\, q(\lambda,X)$$

and we set

$$d_2(\lambda) = \mathrm{Disc}_X(q(\lambda,X)).$$
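Now we claim that if $\lambda$ is not a root of $d_2$, and if the change of coordinates
preserves the degree in $Y$ of $P$, then $\lambda$ belongs to $U$; this will prove the
lemma. Indeed, we choose $(\lambda_0,x_0)$ in the following way:

$$d_2(\lambda_0) \ne 0 \quad (1), \qquad d_1(\lambda_0,x_0) = 0 \quad (2), \qquad x_0 \ne u_i - \lambda_0 v_i \quad (3).$$

In order to satisfy (2) and (3), we choose a root
$x_0$ of $q(\lambda_0,X)$ which is not $u_i - \lambda_0 v_i$.

Now we consider the roots $y_i(X) \in \overline{\mathbb{C}(X)}$ of the polynomial
$f_{\lambda_0}(X,Y) \in \mathbb{C}[X][Y]$ and we get $d_1(\lambda_0,X) = \prod_{i \ne j}(y_i(X) - y_j(X))$. So there
exist $i_0$ and $j_0$ such that $y_{i_0}(x_0) = y_{j_0}(x_0)$ with $i_0 \ne j_0$ (because of (2)). Moreover
$d_2(\lambda_0) \ne 0$, so $q(\lambda_0,X)$ does not have a multiple root, and $\frac{\partial d_1}{\partial X}(\lambda_0,x_0) \ne 0$,
since $x_0 \ne u_i - \lambda_0 v_i$.
Furthermore,

$$\frac{\partial d_1}{\partial X}(\lambda_0,X) = \Big(\frac{\partial y_{i_0}}{\partial X}(X) - \frac{\partial y_{j_0}}{\partial X}(X)\Big) \prod_{\substack{(i,j) \ne (i_0,j_0)\\ i \ne j}} (y_i(X) - y_j(X)) + (y_{i_0}(X) - y_{j_0}(X))\, \frac{\partial}{\partial X}\Big(\prod_{\substack{(i,j) \ne (i_0,j_0)\\ i \ne j}} (y_i(X) - y_j(X))\Big).$$

Thus $\frac{\partial d_1}{\partial X}(\lambda_0,x_0) \ne 0$ implies that for all $(k,l) \ne (i_0,j_0)$ and $k \ne l$ we have
$y_k(x_0) \ne y_l(x_0)$.
Then we conclude that $f_{\lambda_0}(x_0,Y)$ has $n-1$ distinct roots, and one root
has multiplicity two ($y_{i_0}(x_0) = y_{j_0}(x_0)$). As $\frac{\partial f_{\lambda_0}}{\partial X}(x_0, y_{i_0}) \ne 0$ (because $x_0$ is
not the abscissa of a singular point) the claim is proven.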


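Theorem 9.3.16. Let $\lambda_0$ be as in Lemma 9.3.15, $\Delta = \mathrm{Disc}_Y(f_{\lambda_0}(X,Y))$.
Then there exist $X_0 \in \mathbb{C} - \Delta$ and a path $\gamma$ in $\mathbb{C} - \Delta$ such that the monodromy
action relative to $X_0$ of $\gamma$ on the fiber $f_{\lambda_0}(X_0,Y) = \{z_1,\ldots,z_n\}$ is, for $i_0 \ne j_0$:

$$[\gamma].z_{i_0} = z_{j_0}, \qquad [\gamma].z_{j_0} = z_{i_0}, \qquad [\gamma].z_i = z_i \ \text{ if } i \ne i_0 \text{ and } i \ne j_0.$$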


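Proof. a) Let $\lambda_0$, $x_0$ and $y_0$ be as in Lemma 9.3.15. We have:

$$f_{\lambda_0}(x_0,y_0) = \frac{\partial f_{\lambda_0}}{\partial Y}(x_0,y_0) = 0, \quad \frac{\partial^2 f_{\lambda_0}}{\partial Y^2}(x_0,y_0) \ne 0 \quad\text{and}\quad \frac{\partial f_{\lambda_0}}{\partial X}(x_0,y_0) \ne 0.$$

We denote by $y_3,\ldots,y_n$ the simple roots of $f_{\lambda_0}(x_0,Y)$. We have

$$\forall i \ge 3,\ f_{\lambda_0}(x_0,y_i) = 0, \qquad \forall i \ge 3,\ \frac{\partial f_{\lambda_0}}{\partial Y}(x_0,y_i) \ne 0.$$

Fig. 9.5. The monodromy action of $\gamma$ gives the transposition $(y_1\ y_2)$.

So we can apply the analytic version of the implicit function theorem to
every $(x_0,y_i)$. Thus, there exist a neighborhood $V_{y_0}$ of $y_0$, a neighborhood $U^0_{x_0}$
of $x_0$, and an analytic function $\varphi$ such that

$$x \in U^0_{x_0},\ y \in V_{y_0} \text{ and } f_{\lambda_0}(x,y) = 0 \iff x = \varphi(y) \text{ and } y \in V_{y_0}.$$

For $i \ge 3$, there exist a neighborhood $U^i_{x_0}$ of $x_0$, a neighborhood $V_{y_i}$ of $y_i$,
and an analytic function $\varphi_i$ such that:

$$x \in U^i_{x_0},\ y \in V_{y_i} \text{ and } f_{\lambda_0}(x,y) = 0 \iff y = \varphi_i(x) \text{ and } x \in U^i_{x_0}.$$

Now, we consider the parametrization $x = \varphi(y)$. We have $\varphi(y) =
x_0 + a(y - y_0) + b(y - y_0)^2 + \cdots$ in a neighborhood of $y_0$, with

$$a = -\frac{\partial f_{\lambda_0}}{\partial Y}(x_0,y_0) \Big(\frac{\partial f_{\lambda_0}}{\partial X}(x_0,y_0)\Big)^{-1} = 0 \quad\text{and}\quad b = -\frac{1}{2}\,\frac{\partial^2 f_{\lambda_0}}{\partial Y^2}(x_0,y_0) \Big(\frac{\partial f_{\lambda_0}}{\partial X}(x_0,y_0)\Big)^{-1} \ne 0.$$

Thus in a neighborhood of $(x_0,y_0)$ we have

$$(x - x_0) = (y - y_0)^2 \Big[b + \sum_{k \ge 1} a_k (y - y_0)^k\Big].$$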
k>1 


b) The equation $z^2 = b$ has two distinct nonzero roots $r_1$ and $r_2$. Near
$r_1 \ne 0$ the function $c : \mathbb{C} \to \mathbb{C}$ given by $z \mapsto z^2$ has an analytic inverse because
$c'(r_1) = 2r_1 \ne 0$. Let $r$ be this inverse, $r : V_b \to W_{r_1}$. Hence, in a neighborhood
$V^{(1)}_{y_0}$ of $y_0$ we define $R : V^{(1)}_{y_0} \to W_{r_1}$, $y \mapsto r\big(b + \sum_{k \ge 1} a_k (y - y_0)^k\big)$. Thus in
a neighborhood of $(x_0,y_0)$ we have: $x - x_0 = (y - y_0)^2 (R(y))^2$. We denote
by $\psi$ the following map: $\psi : V^{(2)}_{y_0} \to V_0$ given by $y \mapsto (y - y_0)R(y)$, where $V_0$
is a neighborhood of 0, and $V^{(2)}_{y_0}$ is a neighborhood of $y_0$ on which $\psi$ is an
isomorphism (this is possible because $\psi'(y_0) = R(y_0) = r(b) \ne 0$). We denote
then by $\xi$ the inverse of $\psi$. Thus we have a neighborhood $V_{y_0}$ of $y_0$ and a
neighborhood $U_{x_0}$ of $x_0$ such that

$$(*)\quad x \in U_{x_0},\ y \in V_{y_0} \text{ and } f_{\lambda_0}(x,y) = 0 \iff x - x_0 = \psi(y)^2 \text{ and } y \in V_{y_0}.$$

Now we remark that $y_0$ is a root of multiplicity two of $(\psi(y))^2 = 0$, and $\psi$ is a non
constant analytic function on $V_{y_0}$. So we have established (see [Car61] p 97,
Prop 4.2):

$(**)$ There exist $V'_{y_0}$ a neighborhood of $y_0$ and $U'_{x_0}$ a neighborhood of $x_0$ such
that for all $X_0 \ne x_0$ with $X_0$ in $U'_{x_0}$, $(\psi(y))^2 = X_0 - x_0$ has exactly two distinct
simple roots in $V'_{y_0}$.


We set $V = \big(\bigcap_{i \ge 3} U^i_{x_0}\big) \cap U_{x_0} \cap U'_{x_0}$; this is a neighborhood of $x_0$. Now we
choose a real number $\rho > 0$ such that $B(x_0,\rho) \subset V$ and $B(0,\sqrt{\rho}) \subset V_0$.

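c) Lifting paths

We set: $X_0 = x_0 + \rho$ and $\gamma : [0,1] \to \mathbb{C} - \Delta$, $t \mapsto x_0 + \rho e^{2i\pi t}$. Thus
$f_{\lambda_0}(X_0,Y)$ has $n$ distinct roots: $z_1,\ldots,z_n$. Now we write all these roots with
$\xi$ or $\varphi_i$. If $X_0 \in V$ and $y \in V'_{y_0}$ then $f_{\lambda_0}(X_0,y)$ has two distinct roots $z_1$
and $z_2$ in $V'_{y_0}$ by $(**)$, and by $(*)$ we can set $\psi(z_1) = \sqrt{\rho}$ and $\psi(z_2) = -\sqrt{\rho}$.
Hence $z_1 = \xi(\sqrt{\rho})$ and $z_2 = \xi(-\sqrt{\rho})$. Furthermore, $X_0 \in V$ and $y \in V_{y_i}$,
$f_{\lambda_0}(X_0,y) = 0 \Rightarrow y = \varphi_i(X_0)$ for $i = 3 \ldots n$. So we set $z_i = \varphi_i(X_0)$. Now we
lift $\gamma$ above $z_1$ and $z_2$. We set

$$\gamma_1(t) = \xi(\sqrt{\rho}\,e^{i\pi t}); \qquad \gamma_2(t) = \xi(-\sqrt{\rho}\,e^{i\pi t}).$$

These two paths are well defined, continuous and $\gamma_i(0) = z_i$.

For all $t \in [0,1]$, $\gamma(t) - x_0 = \rho e^{2i\pi t} = [\psi(\xi(\sqrt{\rho}\,e^{i\pi t}))]^2$ because $\psi \circ \xi = \mathrm{id}$,
then for all $t \in [0,1]$, $\gamma(t) - x_0 = [\psi(\gamma_1(t))]^2 = [\psi(\gamma_2(t))]^2$.

By $(*)$ we get $f_{\lambda_0}(\gamma(t),\gamma_i(t)) = 0$, $\forall t \in [0,1]$. Thus $\gamma_1$ lifts $\gamma$ above $z_1$ and $\gamma_2$
lifts $\gamma$ above $z_2$. As $\gamma_1(1) = z_2$ and $\gamma_2(1) = z_1$ we get: $[\gamma].z_1 = z_2$, $[\gamma].z_2 = z_1$.

Next we set $\gamma_i(t) = \varphi_i(\gamma(t))$ for $i = 3 \ldots n$. We have

$$f_{\lambda_0}(\gamma(t),\gamma_i(t)) = 0, \quad \forall t \in [0,1],$$

by the definition of $\varphi_i$. Now, $\gamma_i$ lifts $\gamma$ above $z_i$ and $\gamma_i(1) = \varphi_i(\gamma(1)) =
\varphi_i(\gamma(0)) = z_i$. Hence, for $i = 3,\ldots,n$, $[\gamma].z_i = z_i$.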
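
The transposition can be observed numerically on a small example. The sketch below is an illustration under our own assumptions: the curve $y^3 - 3y - x = 0$ (which has $f = \partial f/\partial Y = 0$, $\partial^2 f/\partial Y^2 \ne 0$ and $\partial f/\partial X \ne 0$ at $(x_0,y_0) = (2,-1)$), the loop of radius $0.5$ around $x_0 = 2$, the starting guesses for the fiber and the helper names `newton` and `monodromy_permutation` are all chosen here. Each point of the fiber over $X_0 = 2.5$ is followed along the loop by Newton corrections, and the resulting permutation of the fiber is read off at the end.

```python
import cmath


def newton(f, fy, x, y, iters=8):
    """A few Newton iterations in y for the equation f(x, y) = 0."""
    for _ in range(iters):
        y = y - f(x, y) / fy(x, y)
    return y


def monodromy_permutation(f, fy, loop, guesses, steps=400):
    """Permutation of the fiber over loop(0) induced by the closed path loop."""
    x0 = loop(0.0)
    fiber = [newton(f, fy, x0, g, iters=50) for g in guesses]
    perm = []
    for y in fiber:
        for k in range(1, steps + 1):
            # continuation: correct the previous value at the next base point
            y = newton(f, fy, loop(k / steps), y)
        perm.append(min(range(len(fiber)), key=lambda j: abs(y - fiber[j])))
    return perm


def f(x, y):
    return y**3 - 3 * y - x


def fy(x, y):
    return 3 * y**2 - 3


def loop(t):
    return 2 + 0.5 * cmath.exp(2j * cmath.pi * t)


perm = monodromy_permutation(f, fy, loop, [2.2, -1 + 0.4j, -1 - 0.4j])
print(perm)
```

The two fiber points close to $y_0 = -1$ are exchanged while the third one comes back to itself, which is the behaviour described by the theorem.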


Transpositions will play an important role for the proof of Harris’ lbmma 
and its generalizations (see Theorem 9.3.20). Theorem 9.3.16 will be used with 
the following lemma. 


Definition 9.3.17. Let Gx X — X be a group action. If for every two pairs 
of points 21,22 and y1,y2 (t1 # Xe and y, # Yy2) there is a group element g 
such that g.x; = yi, then the group action is called 2—transitive. 


Lemma 9.3.18. Let $G$ be a subgroup of $\mathfrak{S}_n$ such that the action of $G$ on
$\{1,\dots,n\}$ is 2-transitive and such that there exists a transposition $\tau$ in $G$.
Then $G = \mathfrak{S}_n$.
Proof. We can suppose $\tau = (1\ 2)$. Let $x$ and $y$ be two distinct elements of
$\{1,\dots,n\}$. There exists a permutation $\sigma \in G$ such that $\sigma(1) = x$ and $\sigma(2) = y$
(because the action is 2-transitive). Then $\sigma^{-1}\tau\sigma = (\sigma(1)\ \sigma(2)) = (x\ y) \in
G$. Therefore every transposition belongs to $G$; this implies that $G = \mathfrak{S}_n$.
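
The conjugation step of this proof can be checked on a small case. The sketch below is only an illustration: permutations are stored as tuples on $\{0,\dots,n-1\}$ and composed as functions (right to left), so the conjugate reads $\sigma\tau\sigma^{-1}$; this is the same element as the product written $\sigma^{-1}\tau\sigma$ when products are read left to right. The helper names are hypothetical, and $\mathfrak{S}_4$ itself is used as the 2-transitive group.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def transposition(n, a, b):
    t = list(range(n))
    t[a], t[b] = t[b], t[a]
    return tuple(t)


def conjugates_of_tau(group, tau):
    """All sigma∘tau∘sigma^{-1} for sigma in the group."""
    return {compose(compose(s, tau), inverse(s)) for s in group}


def closure(generators):
    """Subgroup generated by the given permutations (finite case)."""
    n = len(next(iter(generators)))
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = compose(g, h)
            if gh not in seen:
                seen.add(gh)
                frontier.append(gh)
    return seen


n = 4
group = set(permutations(range(n)))      # a 2-transitive group on 4 points
tau = transposition(n, 0, 1)
all_transpositions = {transposition(n, a, b)
                      for a in range(n) for b in range(a + 1, n)}
```

Here `conjugates_of_tau(group, tau)` equals `all_transpositions`, and `closure(all_transpositions)` has $4! = 24$ elements, in line with the two steps of the proof.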


Monodromy and genericity 


We saw in the last subsection that when $P$ is irreducible, the monodromy
action on the fiber is transitive; that is, any two points $y_i$ and $y_j$ of the
fiber $\pi^{-1}(x_0)$ can be exchanged following a continuous path on the curve
on top of some loop $\gamma$. This result also expresses the connectivity of the
subspace formed by the curve $C$ minus the ramification points. In fact there
is a stronger connectivity result which is a consequence of a lemma due to
J. Harris (see [Har80] or [ACGH85]) which was originally used to establish his
uniform position theorem on the generic hyperplane section of a projective
curve.

We will adapt Harris’ lemma to our setting in order to obtain what we call
an Affine Harris theorem. This theorem says that if we perform a generic
change of coordinates before taking the projection, not only is the action of the
monodromy group transitive, but any permutation of the points of the fiber
$\pi^{-1}(x_0)$ can be obtained following a continuous path on the curve on top of
some loop $\gamma$. This key fact and its application to absolute factorization was
first observed by Galligo, stated in [GW97], then in [Gal99] and in [Rup00],
[Rup04]. However in these papers it was just indicated that this statement was
a consequence of Harris’ lemma and classical arguments in algebraic geometry.
As Sommese-Verschelde-Wampler needed this statement to improve their
algorithm (see below), in [SVW02c] they gave a more complete proof of it
and made precise references to two textbooks ([ACGH85] and [GM83]). In
the next subsection, we will give a detailed exposition of this result. Let us
start by reviewing Harris’ lemma and its proof in the case of plane curves.


Lemma 9.3.19 (Harris’ lemma). Let $C$ be an irreducible projective plane
curve, possibly singular, and call $n$ its degree. Let $U$ be the Zariski open subset
in $\mathbb{P}^2(\mathbb{C})^*$ of lines transverse to $C$, i.e. cutting $C$ in $n$ simple points. Consider
the incidence correspondence graph $I$ and its second projection:

$$pr_2 : I = \{(p, H) \in C \times U \mid p \in H\} \to U.$$

Then $pr_2$ is an $n$-fold topological covering. We fix $H_0 \in U$ and let $I_0$ denote
the set of $n$ intersection points $C \cap H_0$. Then the monodromy map

$$\pi_1(U, H_0) \to \mathfrak{S}_n$$

is surjective.


Proof. Let $G$ be the image of the monodromy map. We know by application
of Theorem 9.3.11 that $C$ minus its singular locus is path connected. As by
Theorem 9.3.16, $U$ contains in its border a line $H_1$ which is tangent to $C$
at a simple point and transverse to $C$ at all the other $n - 2$ intersection
points, we deduce that $G$ contains a transposition. So in order to apply Lemma
9.3.18 we need only to prove that $G$ is 2-transitive. To express this property
geometrically let us set

$$I_2 = \{(p_1, p_2, H) \in C \times C \times U \mid p_1 \in H,\ p_2 \in H,\ p_1 \neq p_2\}$$

and similarly

$$J_2 = \{(p_1, p_2, H) \in C \times C \times \mathbb{P}^2(\mathbb{C})^* \mid p_1 \in H,\ p_2 \in H,\ p_1 \neq p_2\}.$$

With this definition, obviously $J_2$ is a line bundle over $C \times C - \Delta$ where $\Delta$ is
the diagonal of $C \times C$ (indeed two distinct points define a line). Now $\Delta$ is a
complex subvariety of $C \times C$ of strictly smaller dimension, and $C \times C$ is path
connected because $C$ is path connected. Therefore $J_2$ is also path connected. As
$I_2$ is obtained from $J_2$ by subtracting a complex subvariety of strictly smaller
dimension, it is also path connected. This implies that $G$ is 2-transitive.


The Affine Harris theorem 


Theorem 9.3.20 ([GW97]). Let $P \in \mathbb{Q}[X,Y]$ be an absolutely irreducible
polynomial of total degree $n$. Let $C$ be the corresponding affine curve in $\mathbb{C}^2$.
Then there exists a Zariski open set of affine changes of coordinates such that
the projection on the new first coordinate $x$:

$$pr_1 : C \to \mathbb{C}$$

induces on the fiber $pr_1^{-1}(0)$ a monodromy map

$$\pi_1(\mathbb{C} - \Delta, 0) \to \mathfrak{S}_n$$

which is surjective.


Proof. The theorem is a corollary of Harris’ lemma and a classical theorem
of van Kampen, recalled below as Theorem 9.3.21. With the notations of the
last subsection we identify $\mathbb{C} - \Delta$ with the set of all lines in $\mathbb{C}^2$ parallel to the
$Oy$-axis and transverse to $C$. Then we include this set in the intersection of $U$
with the line of $\mathbb{P}^2(\mathbb{C})^*$ formed by all the lines passing through the point at
infinity corresponding to the $Oy$-axis. Moreover we suppose that $0$ is not in
$\Delta$ and choose $H_0 = Oy$. Then to prove the theorem, it suffices to show that
the induced group homomorphism

$$\pi_1(\mathbb{C} - \Delta, 0) \to \pi_1(U, H_0)$$

is surjective.
We view $U$ as the complement of a reduced projective curve in the dual
projective plane $\mathbb{P}^2(\mathbb{C})^*$, which is isomorphic to the usual projective plane
$\mathbb{P}^2(\mathbb{C})$. Then our theorem will be a consequence of a classical theorem of
E. van Kampen from 1933. A short, rigorous and self-contained exposition of
this last result was given in a paper by D. Cheniot in 1973. It contains a
precise description (by generators and relations) of the fundamental groups.
We summarize it as follows.


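Theorem 9.3.21 (van Kampen). Let $H$ be a reduced algebraic curve of
degree $n$ in $\mathbb{P}^2(\mathbb{C})$ and $A$ be a point not on $H$. Let $L_i$ for $i = 1$ to $m$ and $L_\infty$ be
all the lines passing through $A$ and not transverse to $H$. Call $\lambda_i$ for $i = 1$ to $m$
and $\lambda_\infty$ their directions in a $\mathbb{P}^1(\mathbb{C})$ complementary to $A$. Let $L$ be a line passing
through $A$ and transverse to $H$, and call $\lambda$ its direction. Then

$$\pi_1(\mathbb{P}^1(\mathbb{C}) - \{\lambda_1, \dots, \lambda_m, \lambda_\infty\}, \lambda) \to \pi_1(\mathbb{P}^2(\mathbb{C}) - H, \lambda)$$

is surjective.
With our notation, we get

$$\pi_1(\mathbb{P}^1(\mathbb{C}) - \{\lambda_1, \dots, \lambda_m, \lambda_\infty\}, \lambda) = \pi_1(\mathbb{C} - \Delta, 0)$$

and we are done.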


Remark 9.3.22. It would have been more elegant to provide an algebraic proof
of our Affine Harris theorem. A natural way to do this is to adapt the proof
recalled in the last section, by using Jouanolou’s version of Bertini’s theorem
applied to the algebraic set $I_2$. However, this only proves a weaker version of
our claim. Indeed, instead of obtaining the monodromy map associated to the
lines parallel to a generic direction $Oy$, we get the monodromy map associated
to the lines passing through a generic point of $\mathbb{P}^2(\mathbb{C})$ and we cannot be sure
that we can choose such a point on the line at infinity. So we are led to rely on
a topological analysis, which in this situation gives more precise information.


Composite Monodromy 

To validate the Galligo-Rupprecht algorithm, a result for the composite 
case is needed. 

When $P$ has several factors, then $C$ has several irreducible components
$C_1, \dots, C_s$, and each of them has a monodromy action. So we can relate the
monodromy of $C$ to the monodromies of the $C_i$. The result obtained in [Gal99]
and in [Rup00] says that after a generic change of coordinates, the following
group homomorphism is surjective:

$$\pi_1(\mathbb{C} - \Delta) \to \mathfrak{S}_{n_1} \times \mathfrak{S}_{n_2} \times \cdots \times \mathfrak{S}_{n_s}.$$

This result is a straightforward corollary of the proof of Theorem 9.3.20 that
we explained above.
9.3.2 The Galligo-Rupprecht algorithm


The algorithm takes as input a bivariate polynomial with rational coefficients 
irreducible over Q, performs a generic affine change of coordinates in order 
to get a new polynomial P(X,Y), and outputs an approximate absolute fac- 
torization of P. This will be used in the next section to produce an exact 
absolute factorization of P. 

The algorithm first finds a key combinatorial fact about the target factorization:
the partition generated by the factorization on a smooth generic fiber
$P(0, Y) = 0$.

This is obtained by analyzing the restriction of the factorization modulo
$X^3$, which gives so-called ’Zero-sum’ relations (see below for the history of
this concept). Our Affine Harris Theorem proves that these relations indeed
provide sufficient and necessary conditions for the absolute factorization of $P$.
The corresponding factorization of $P(0, Y)$ is later lifted to a factorization of
$P$ by Hensel liftings. In order to compute efficiently these ’Zero-sum’ relations
and Hensel liftings we rely on good Newton approximations of the roots of
$P(0, Y) = 0$.

So, first we perform some reductions on the input polynomial in order to
get a monic square-free polynomial which is irreducible (over $\mathbb{Q}$). We consider
a “generic” affine change of coordinates in 2 variables

$$X = x + ay + b ; \qquad Y = y + c.$$


In practice, this means a change of coordinates whose coefficients (a, b,c) are 
decimal numbers provided by a “random function” that one can find on any 
computer. 

Simplifying, we get a new monic polynomial in $\mathbb{Q}[x, y]$, that we call $P$:

$$y^n + a_{n-1}(x)\, y^{n-1} + \cdots + a_0(x) \quad \text{with } \deg a_i(x) \leq n - i.$$

A consequence of the fundamental Lemma 9.0.8 is that the factors all have 
the same degree. 

As there are efficient algorithms for the detection of factors of degree 1,
we suppose that the degree of the factors of $P$ is greater than or equal to 2. This
assumption will be used in Lemma 9.3.23. Now we describe the main ideas
behind the Zero-sum relations.


Definition of the numbers b; and their properties 


Let $P$ be a square-free polynomial in $\mathbb{Q}[X,Y]$ of total degree $n$, monic in
$Y$. For $x_0 \in \mathbb{Q}$, we denote by $y_1(x_0), \dots, y_n(x_0)$ the roots of $P(x_0, Y)$. Then
for all but at most $n(n - 1)$ values of $x_0$, these roots are distinct and the
curve defined by $P$ is smooth at the points $(x_0, y_i(x_0))$, for $i = 1, \dots, n$. If we
choose such a value for $x_0$, then there exist analytic functions $y_i(X)$ in the
neighborhood of $x_0$ (for $i = 1, \dots, n$) such that

$$\begin{cases} y_i(X)|_{X = x_0} = y_i(x_0) \\ P(X, y_i(X)) = 0. \end{cases}$$
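There exist complex numbers $a_i$ and $b_i$ (for $i = 1, \dots, n$) such that

$$y_i(X) = y_i(x_0) + a_i (X - x_0) + b_i (X - x_0)^2 + \cdots$$

If

$$\alpha(x,y) = \frac{\partial P}{\partial x}(x,y), \quad \beta(x,y) = \frac{\partial P}{\partial y}(x,y),$$
$$\gamma(x,y) = \frac{\partial^2 P}{\partial x^2}(x,y), \quad \delta(x,y) = \frac{\partial^2 P}{\partial y^2}(x,y), \quad \varepsilon(x,y) = \frac{\partial^2 P}{\partial x \partial y}(x,y),$$

then we have

$$a_i = -\frac{\alpha(x_0, y_i(x_0))}{\beta(x_0, y_i(x_0))}$$

and

$$b_i = -\frac{1}{2\beta(x_0, y_i(x_0))} \left( \gamma(x_0, y_i(x_0)) + 2\varepsilon(x_0, y_i(x_0))\, a_i + \delta(x_0, y_i(x_0))\, a_i^2 \right).$$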
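
These expressions follow from differentiating the identity $P(X, y_i(X)) = 0$ twice with respect to $X$. The short derivation below is a sketch of that step, writing $\alpha, \beta, \gamma, \delta, \varepsilon$ for the first and second partial derivatives of $P$ (with $\varepsilon$ the mixed one) evaluated at $(X, y_i(X))$.

```latex
% First derivative of P(X, y_i(X)) = 0:
\[
  \alpha + \beta\, y_i'(X) = 0
  \quad\Longrightarrow\quad
  y_i'(X) = -\frac{\alpha}{\beta}.
\]
% Second derivative of the same identity:
\[
  \gamma + 2\varepsilon\, y_i'(X) + \delta\, y_i'(X)^2 + \beta\, y_i''(X) = 0
  \quad\Longrightarrow\quad
  y_i''(X) = -\frac{\gamma + 2\varepsilon\, y_i' + \delta\, (y_i')^2}{\beta}.
\]
% Taylor coefficients at X = x_0:
\[
  a_i = y_i'(x_0), \qquad b_i = \tfrac{1}{2}\, y_i''(x_0).
\]
```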
We use these formulas to introduce analytic functions $a$ and $b$ defined on
$C - pr_1^{-1}(\Delta)$ which is an $n$-fold covering of an open subset of the complex plane
(see Subsection 9.3.1). We set

$$a(X,Y) = -\frac{\alpha(X,Y)}{\beta(X,Y)},$$
$$b(X,Y) = -\frac{1}{2\beta(X,Y)} \left( \gamma(X,Y) + 2\varepsilon(X,Y)\, a + \delta(X,Y)\, a^2 \right) \in \mathbb{C}(X,Y).$$

Let $U$ be a (small) open neighborhood of $x_0$ in $\mathbb{C}$ where all the $y_i(X)$ are
defined for $i = 1$ to $n$. Then denote by $V = \bigcup V_i$ its inverse image $pr_1^{-1}(U)$
in $C - pr_1^{-1}(\Delta)$. We also consider the restrictions of $a(X,Y)$ and $b(X,Y)$ to
each $V_i$, and set $b_i(X) = b(X, y_i(X))$.
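
As a quick numerical check of these formulas, the sketch below evaluates $a = -\alpha/\beta$ and $b = -(\gamma + 2\varepsilon a + \delta a^2)/(2\beta)$ for the illustrative polynomial $P = Y^2 - 1 - X$ at $x_0 = 0$, whose roots are $y = \pm\sqrt{1+X} = \pm(1 + X/2 - X^2/8 + \cdots)$. The example polynomial and the function name are assumptions made for the illustration, and the partial derivatives are entered by hand.

```python
def second_order_coefficients(alpha, beta, gamma, delta, eps):
    """Return (a, b) from the values of the partial derivatives of P at a root.

    alpha = dP/dx, beta = dP/dy, gamma = d2P/dx2, delta = d2P/dy2,
    eps = d2P/dxdy, all evaluated at the point (x0, y_i(x0)).
    """
    a = -alpha / beta
    b = -(gamma + 2 * eps * a + delta * a * a) / (2 * beta)
    return a, b


# P(X, Y) = Y**2 - 1 - X at x0 = 0: dP/dx = -1, dP/dy = 2*y,
# d2P/dx2 = 0, d2P/dy2 = 2, d2P/dxdy = 0.
coefficients = [
    second_order_coefficients(alpha=-1.0, beta=2.0 * y, gamma=0.0, delta=2.0, eps=0.0)
    for y in (1.0, -1.0)
]
```

The two results are $(1/2, -1/8)$ and $(-1/2, 1/8)$, matching the series of $\pm\sqrt{1+X}$; the two values of $b$ sum to zero.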

As $P(X, y_i(X)) = 0$ on $U$ for all $i$, and $P$ is monic in $Y$, we can write
$P(X,Y) = \prod_{i=1}^{n} (Y - y_i(X))$.

For each $k = 1, \dots, s$, the factor $P_k$ in the factorization $P(X,Y) =
\prod_{k=1}^{s} P_k$ can be written

$$P_k(X,Y) = \prod_{j=1}^{m} (Y - y_{i_j}(X)). \qquad (1)$$

The total degree of $P_k$ is $m$ so we can write:

$$P_k(X,Y) = Y^m + q_1(X)\, Y^{m-1} + q_2(X)\, Y^{m-2} + \cdots + q_m(X) \qquad (2)$$

where $q_j(X) \in \overline{\mathbb{Q}}[X]$ and $\deg(q_j(X)) \leq j$. In particular, $\deg(q_1(X)) \leq 1$
so the coefficient of its degree two term is zero. From (1) and (2), we get

$\sum_{j=1}^{m} y_{i_j}(X) = -q_1(X)$; thus, comparing the coefficients of $(X - x_0)^2$ on both sides,

$$\sum_{j=1}^{m} b_{i_j} = 0.$$

So we have found a necessary condition on each factor $P_k$ of $P$. Our aim is to
prove that such a Zero-sum relation is also a sufficient condition to characterize
a factor of $P$. The next lemma is an intermediate result in that direction.


Lemma 9.3.23. If $P$ has no factor of degree 1, then for almost all values of
$x_0$, the numbers $b_i$ are all non-zero (for $i = 1, \dots, n$).


Proof. We consider the analytic function on $U$ defined by $\mathcal{P} = \prod_{i=1}^{n} b_i(X) =
\prod_{i=1}^{n} b(X, y_i(X)) \in \mathbb{C}(X, y_1(X), \dots, y_n(X))$.

$\mathcal{P}$ is a symmetric rational function of the $y_i(X)$. Thus by definition of the
$y_i(X)$, we deduce that $\mathcal{P}$ is a rational function of the coefficients of $P$ as a
polynomial in $Y$. So $\mathcal{P}$ belongs to $\mathbb{C}(X)$.

If $\mathcal{P} \neq 0$ in $\mathbb{C}(X)$ then for almost all $x_0$ in $\mathbb{C}$, $\mathcal{P}(x_0) \neq 0$. Thus $b_i(x_0) =
b(x_0, y_i(x_0)) \neq 0$ for all $i$, and we are done.

If $\mathcal{P} = 0$ in $\mathbb{C}(X)$, we shall see that we get a contradiction.

As $\mathcal{P}(x) = \prod_{i=1}^{n} b(x, y_i(x)) = 0$ on $U$ and the $b(x, y_i(x))$ are analytic functions
on $U$, there exists an index $i_0$ such that $b(x, y_{i_0}(x)) = 0$ on $U$ (because the
ring of analytic functions on $U$ is an integral domain).

This implies $y''_{i_0}(x) = 0$ on $U$, then $y^{(r)}_{i_0}(x) = 0$ on $U$ for every $r \geq 2$.

Thus $y_{i_0}(x) = y_{i_0}(x_0) + y'_{i_0}(x_0)(x - x_0)$ on $U$.

We perform the Euclidean division of $P(X,Y)$ by

$$F(X,Y) = Y - y_{i_0}(x_0) - y'_{i_0}(x_0)(X - x_0)$$

in $\mathbb{C}(X)[Y]$. Since $F(X,Y)$ is monic in $Y$, we get

$$P(X,Y) = A(X,Y)\, F(X,Y) + R(X),$$

with $A(X,Y) \in \mathbb{C}[X,Y]$ and $R(X) \in \mathbb{C}[X]$.

Therefore, for every $x \in U$, as $F(x, y_{i_0}(x)) = 0$ and $P(x, y_{i_0}(x)) = 0$, we have $R(x) = 0$; then
$R(X) = 0$ in $\mathbb{C}[X]$. This implies that $P$ has a factor of degree 1, contrary to
hypothesis.


Now we can prove the theorem: 


Theorem 9.3.24. Let $P$ be an irreducible polynomial of degree $n$. Consider
$P(X, Y, \lambda) = P(X + \lambda Y, Y)$. Then for almost all specializations $(x_0, \lambda_0)$ of
$(x, \lambda)$ in $\mathbb{C} \times \mathbb{Q}$, none of the sums $\sum_{i \in J} b_i$, for $J \subsetneq \{1, \dots, n\}$, vanishes.

Remark 9.3.25. In the previous statement, “almost all” means that we have
to avoid a finite number of $x_0$ and a finite number of $\lambda_0$.
Proof. First we perform a change of coordinates, and we choose $\lambda_0$ such that
the conclusion of the Affine Harris Theorem 9.3.20 is true.

We denote by $f_{\lambda_0}(X,Y)$ the polynomial $P(X + \lambda_0 Y, Y)$. Hereafter, the
numbers $b_i$, $y_i(x_0)$ and the rational function $b$ are related to $f_{\lambda_0}$. We set
$m < n$.

Now we want to prove that for almost all $x_0$, we do not have $\sum_{i \in J} b_i = 0$
with $J \subsetneq \{1, \dots, n\}$ and $|J| = m$. We consider the functions $B_J = \sum_{i \in J} b_i(X)$,
and $\mathcal{B} = \prod_{J \in E_m} B_J$ where

$$E_m = \{\{\sigma(1), \dots, \sigma(m)\} \mid \sigma \in \mathfrak{S}_n\}.$$

$\mathcal{B}$ is a rational function in $X, y_1(X), \dots, y_n(X)$, and is symmetric with respect
to these last $n$ arguments. So as in Lemma 9.3.23, we have that $\mathcal{B}$ is a rational
function of $X$. We want to prove that $\mathcal{B} \neq 0$ in $\mathbb{C}(X)$. For this, we suppose
that $\mathcal{B} = 0$ in $\mathbb{C}(X)$, and show that we get a contradiction. Choose $x_0 \notin \Delta =
\mathrm{Disc}_Y(f_{\lambda_0}(X,Y))$ such that for $i = 1, \dots, n$, $b_i(x_0) \neq 0$ (*) (this is possible by
Lemma 9.3.23). Then for every $x$ in $U$ (the neighborhood where all the $y_i$ are
defined),

$$\beta(x, y_i(x)) = \frac{\partial f_{\lambda_0}}{\partial Y}(x, y_i(x)) \neq 0 \quad \text{for } i = 1, \dots, n.$$

Therefore $b(x, y_i(x))$ is well defined and analytic on $U$, thus $B_J(x)$ is analytic
on $U$. As $\mathcal{B} = 0$, there exists a set $J_0$ such that $B_{J_0}(x) = 0$ on $U$. We can
suppose $J_0 = \{1, 2, \dots, m\}$, and we set $J_1 = \{2, 3, \dots, m-1, m, m+1\}$.

By the Affine Harris Theorem 9.3.20, there exists a closed path $\gamma$ such
that $[\gamma]$ acts on $\{y_1(x_0), \dots, y_n(x_0)\}$ as the transposition $(1\ \ m+1)$. That is
to say: if we denote by $\gamma_{y_i}$ the lifting of $\gamma$ above $y_i$, so $\gamma_{y_i} = (\gamma, \phi_{y_i})$ where $\gamma$
and $\phi_{y_i}$ are analytic (see Lemma 9.3.9), then

$$\phi_{y_i}(0) = y_i \quad \text{for } i = 1, \dots, n,$$
$$\phi_{y_i}(1) = y_i \quad \text{if } i \neq 1 \text{ and } i \neq m+1,$$
$$\phi_{y_1}(1) = y_{m+1} \quad \text{and} \quad \phi_{y_{m+1}}(1) = y_1.$$

Now we set $H(t) = \sum_{i=1}^{m} b(\gamma(t), \phi_{y_i}(t))$. This is an analytic function on
$]0,1[$ such that $H(0) = B_{J_0}(x_0)$ and $H(1) = B_{J_1}(x_0)$.

As $H(t) = B_{J_0}(\gamma(t))$ for $t \in \gamma^{-1}(U)$ and $B_{J_0} = 0$ on $U$, we get $H = 0$
because $H$ is analytic. Thus $B_{J_0}(x_0) = B_{J_1}(x_0)$. This implies $b_1 = b_{m+1}$. We
can do the same thing for all the other indices, so we have $b_1 = \dots = b_n$.

Finally the necessary condition $\sum_{i=1}^{n} b_i = 0$ gives $b_i = 0$ for $i = 1, \dots, n$,
and this leads to a contradiction (see (*)).


With the same method of proof, we get the following theorem in the re- 
ducible case: 


Theorem 9.3.26. Let $P$ be a polynomial of degree $n$ and let
$Q(x, y, \lambda) = P(x + \lambda y, y)$. Then for almost any specialization $(x_0, \lambda_0)$ of $(x, \lambda)$,
a sum $\sum_{i \in J} b_i$, with $J \subset \{1, \dots, n\}$, vanishes if and only if it corresponds to
the union of roots of a family of factors of $P$.


The factorization algorithm also uses another similar generic property of
the numbers $b_i$: the numbers $b_i$ (defined in Section 9.3.2) are all different. See

Terms of higher degree in the expansion of $y_i$

We have seen that after a generic change of coordinates, a vanishing sum of
a subset of the numbers $b_i$ (defined in Section 9.3.2) corresponds to a factor of
$P$. We could also show the same result on other terms in the series expansion
near a root $y_i$: if we write

$$\varphi_i(x) = y_i + a_i (x - x_0) + b_i (x - x_0)^2 + c_i (x - x_0)^3 + \cdots$$

a factor of $P$ (generically) corresponds to a vanishing sum of the numbers
$c_i$. This remains true for terms of higher order. This could be used to get a
stronger certification on the partition of $\{1, \dots, n\}$. However it would take
time to compute these higher degree terms and their sums, and the condition
on the numbers $b_i$ is strong enough to discover the relation.


The Algorithm 


This algorithm was implemented. The implementation, made by D. Rupprecht,
is written in C using the PARI library for multiprecision computation and
computation in extensions of $\mathbb{Q}$. The algorithm takes as input 2 constants
$prec_1$ and $prec_2$. The first one is used to test if a number is equal to 0 (if its
absolute value is lower than $10^{-prec_1}$ then the number is 0). The other one,
$prec_2$, is the number of digits for computations.
Algorithm 9.3.27 THE GALLIGO-RUPPRECHT ALGORITHM
Input: $P(X,Y)$ a square-free polynomial in $\mathbb{Q}[X,Y]$, irreducible in $\mathbb{Q}[X,Y]$.


1. After a generic linear change of coordinates $X \to x + \lambda_0 y + x_0$, $Y \to y$
(with rational coefficients), we get a monic polynomial in $y$, denoted by
$Q(x, y)$.

2. Compute numerically the roots $y_1, \dots, y_n$ of $Q(0, y)$ and the second order
coefficients $b_i$ defined above.

3. Look for a minimal partition $I_1, \dots, I_m$ of $\{1, \dots, n\}$ such that $\mathrm{card}(I_1) =
\dots = \mathrm{card}(I_m)$ and $B_{I_k} = \sum_{i \in I_k} b_i = 0$ for $k = 1, \dots, m$. Denote by $R_k$
the polynomial $R_k = \prod_{j \in I_k} (y - y_j)$. We have $Q(0, y) = R_1 \cdots R_m$.

4. By the last theorem, the polynomials $R_k$ correspond to the trace of the
factors of $Q$ for $x = 0$. Performing Hensel liftings on these polynomials,
one obtains new polynomials $Q_k$ such that


$$\begin{cases} Q = Q_1 \cdots Q_m \bmod x^{n+1}, \\ Q_k(0, y) = R_k(y) \quad \text{for } k = 1, \dots, m. \end{cases}$$


5. Performing the inverse change of coordinates, one obtains numerical polynomials
$\tilde{P}_1, \dots, \tilde{P}_m$ which provide a candidate for an absolute factorization
of $P$.

6. The last step of the algorithm is to find an extension of $\mathbb{Q}$ and conjugate
polynomials $P_1, \dots, P_m$ where $\tilde{P}_k$ is a good approximation of $P_k$. Finally
one can test if $P_1$ is a divisor of $P$.


Output: $P_1(X,Y) \in \mathbb{Q}[\alpha][X,Y]$, an absolute factor of $P(X,Y)$, and the minimal
polynomial $q(t)$ of $\alpha$ over $\mathbb{Q}$.


Zero-sums $B_J$


This problem is the difficult part of the algorithm. We have a set of complex
numbers $b_1, \dots, b_n$ and we are looking for vanishing sums of these numbers.
This combinatorial problem could be solved by an extensive search among all
the $2^n$ sums. For $n = 60$, we would have to compute more than $10^{18}$ sums (or
keep in memory some of these sums). D. Rupprecht [Rup00] proposed several
improvements for detecting vanishing sums; the complexity that he got
for this step is $O(2^{n/4})$. With this we can easily factorize polynomials up to
degree 80.
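
For small $n$ the extensive search mentioned above is easy to write down. The following is a minimal sketch of that brute-force search only (it does not reproduce the improvements of [Rup00]); the function name, the tolerance `tol` and the sample values are illustrative assumptions.

```python
from itertools import combinations


def zero_sum_subsets(b, tol=1e-9):
    """Return all proper non-empty index subsets J with |sum_{i in J} b_i| < tol.

    Exhaustive search over the subsets of {0, ..., n-1}: exponential in n,
    so only usable for small n.
    """
    n = len(b)
    hits = []
    for size in range(1, n):
        for J in combinations(range(n), size):
            if abs(sum(b[i] for i in J)) < tol:
                hits.append(J)
    return hits


# Illustrative data: two groups of numbers, each summing to zero.
b_values = [0.125, -0.125, 0.3 + 0.1j, -0.3 - 0.1j]
relations = zero_sum_subsets(b_values)
```

On this sample the search returns the two index sets `(0, 1)` and `(2, 3)`, which would correspond to a partition of the fiber into two factors of degree 2.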


Zero-sums and the knapsack problem 


We write the problem of the vanishing sums in the following way. Let $v_1 =
(1, 0, \dots, 0, \Re(b_1), \Im(b_1))$, ..., $v_i = (0, \dots, 0, 1, 0, \dots, 0, \Re(b_i), \Im(b_i))$, ..., $v_n =
(0, \dots, 0, 1, \Re(b_n), \Im(b_n))$ be $n$ vectors of $\mathbb{R}^{n+2}$ ($\Re(z)$ is the real part of the
complex number $z$ and $\Im(z)$ is the imaginary part). We consider the lattice
$\mathcal{L}$ generated by these vectors. A zero-sum $\sum_{i \in I} b_i = 0$ corresponds to the
“small” vector of $\mathcal{L}$:

$$\sum_{i \in I} v_i = \Big(\lambda_1, \dots, \lambda_n, \sum_{i \in I} \Re(b_i), \sum_{i \in I} \Im(b_i)\Big) = (\lambda_1, \dots, \lambda_n, 0, 0)$$

where $\lambda_i = 1$ if $i$ belongs to $I$ and $\lambda_i = 0$ otherwise. We call these vectors
zero-sum vectors, and the minimal zero-sum vector the vector corresponding
to the minimal zero-sum relation (i.e. there is no subset $J \subsetneq I$ such that
$\sum_{j \in J} b_j = 0$).

Thus, instead of computing all the $2^n$ sums, we can try to get a small
vector of $\mathcal{L}$ in order to obtain a zero-sum vector. Now we remark that the
matrix with rows corresponding to minimal zero-sum vectors is in a reduced
row echelon form. Indeed each of its first $n$ columns contains, as entries, one 1 and
otherwise zeros; moreover the last columns are identically zero.

Therefore we get the following method to obtain the zero-sum relations.
Firstly, we compute a new basis $\{w_1, \dots, w_n\}$ of $\mathcal{L}$ with the LLL algorithm.
Secondly, we take all the $w_i$ with small norm. More precisely, we set:

$$L' = \{w_i \mid \|w_i\| < B\}$$

where $B$ is a small real number. Then we compute the reduced row echelon
form of the matrix whose $i$-th row is the $i$-th vector of $L'$. This gives, if $B$ is not
too large, a zero-sum vector.
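
The encoding of the numbers $b_i$ as lattice vectors can be sketched as follows. This only builds the basis vectors $v_i = (e_i, \Re(b_i), \Im(b_i))$ and forms the combination attached to an index set; the LLL reduction and the row echelon step are not implemented here. Function names and sample values are illustrative assumptions.

```python
def knapsack_basis(b):
    """Rows v_i = (0,...,1,...,0, Re(b_i), Im(b_i)) in R^(n+2)."""
    n = len(b)
    rows = []
    for i in range(n):
        unit = [1.0 if j == i else 0.0 for j in range(n)]
        z = complex(b[i])
        rows.append(unit + [z.real, z.imag])
    return rows


def combine(basis, J):
    """Sum of the basis vectors indexed by J."""
    width = len(basis[0])
    return [sum(basis[i][c] for i in J) for c in range(width)]


b_values = [0.125, -0.125, 0.3 + 0.1j, -0.3 - 0.1j]
basis = knapsack_basis(b_values)
zero_sum_vector = combine(basis, [0, 1])   # b_0 + b_1 = 0
generic_vector = combine(basis, [0, 2])    # b_0 + b_2 != 0
```

For the zero-sum index set the last two coordinates vanish, so the vector is the 0-1 indicator of the set followed by two zeros; for other index sets the last two coordinates record the non-zero sum. In practice one would typically scale the last two coordinates by a large constant before reduction so that non-zero sums are penalized; that scaling is an assumption of this note, not something stated in the text.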

This method is very close to the algorithm of van Hoeij which factorizes
a polynomial $f(X)$ over $\mathbb{Z}[X]$ [vH02] and has been developed by Chèze in
[Ché04]. One of the ideas of van Hoeij is to use 0-1 vectors, instead of using a
vector of coefficients of a divisor of $f$ which can have much larger coordinates.


9.3.3 Contribution of other authors 
Monodromy and homotopy methods 


The first use of a monodromy method to provide an algorithm for computing
an absolute factorization was made by Bajaj et al. [BCGW93] in order to
prove a complexity result. Their algorithm was never implemented because
it amounts to considering all the loops around the points of the discriminant
locus $\Delta$.

A.J. Sommese, J. Verschelde and C.W. Wampler developed a geometric
method, in a series of articles, to separate the components of an algebraic variety.
Specialized to the case of a plane curve it provides a geometric algorithm
for computing an absolute factorization. It is based on numerical computations
and relies on so-called continuation or homotopy methods. This amounts to
following, in $\mathbb{C}$ or $\mathbb{C}^2$, integral curves of some differential equation and avoiding
singularities.
As in the last section, suppose the input is a reduced plane curve $C$ given
by a square-free polynomial $P(x, y)$ of (total) degree $n$ and monic in $y$. They
consider a generic smooth section, say for $x = 0$, which consists of $n$ simple
points. So the question is to find the partition of this fiber by the irreducible
components of $C$.

The curve $C$ is a ramified covering of degree $n$ of $\mathbb{C}$. Let $\Delta$ be the discriminant
locus of the projection $\pi$. We suppose that 0 is not in $\Delta$. As we have
seen in a previous section, for any loop $\gamma$ in $\mathbb{C} - \Delta$ (starting and ending at
0), following the roots of $P(\gamma(t), y)$ over such $\gamma$, one gets a permutation of the
fiber $\pi^{-1}(0)$.

Sommese, Verschelde and Wampler made the following important observation.
They considered a few random loops $\gamma$ in $\mathbb{C} - \Delta$ (starting and ending
at 0), and noted that in general they generate enough permutations of the
fiber $\pi^{-1}(0)$ to recover the desired partition. So, in practice, they did not
need to follow all the loops as in [BCGW93].
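
Once a few monodromy permutations of the fiber are known, the partition they generate is the set of orbits of the group they span. The sketch below computes these orbits with a union-find structure; it is an illustration under the assumption that the permutations have already been obtained by path tracking, and the function name and sample data are hypothetical. Each orbit lies inside the set of points of a single irreducible component, and the orbits coincide with the desired partition once enough loops have been followed.

```python
def orbits(n, perms):
    """Orbits of {0, ..., n-1} under the group generated by perms.

    Each permutation is a tuple p with p[i] the image of i.
    """
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for p in perms:
        for i, image in enumerate(p):
            ri, rj = find(i), find(image)
            if ri != rj:
                parent[rj] = ri             # i and p(i) are in the same orbit

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Illustrative data: two loops on a fiber of 5 points.
sample_perms = [(1, 0, 2, 3, 4), (0, 1, 3, 4, 2)]
partition = orbits(5, sample_perms)
```

With these two permutations the fiber splits into the orbits `[0, 1]` and `[2, 3, 4]`, suggesting one factor of degree 2 and one of degree 3.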

It is not at all an easy task to follow precisely such a path $\gamma$ and the
$n$ paths over it to get the corresponding permutation of the fiber. Indeed,
if the chosen time step is too large near a value $x_0$, then the computation
can cause a confusion between the various roots of $P(x_0, y)$, and the obtained
permutation could be false. Such scaling problems are really tough. They later
improved their algorithm by using a criterion based on our Zero-sum method;
see Chapter 8 in this book.

They demonstrated on a large problem, coming from an application in
robotics, that their strategy and implementation are efficient. Moreover, in
[CGKW02], R.M. Corless, A. Galligo, I.S. Kotsireas, and S.M. Watt proposed
a combination of a homotopy method with the other two approaches, in order
to diminish the potential risk of errors.


Zero-sum relations and the Japanese school 


As above, the input is a square-free polynomial $P(x, y)$ of (total) degree $n$
and monic in $y$. Consider the fiber over 0 and the corresponding factorization
into $n$ linear factors in $\mathbb{C}[[x]][y]$:

$$P = \prod_{i=1}^{n} (y - \varphi_i(x)).$$

In [SSKS91], the authors became the first to develop an algorithm based
on Zero-sum relations, a concept that they introduced. Sasaki and his coworkers
also proposed an algorithm which proceeds as follows. They consider the
(integer) powers $\varphi_i(x)^k$ of the $\varphi_i(x)$. Their sums are called Newton
sums; they are symmetric functions of the roots, hence polynomials in the coefficients
of $P$, and therefore polynomials in $x$ of bounded degrees. So if we denote by $|f|_d$ the sum of terms of
degree $\geq d$ of $f$ then we have, for some well chosen $d \geq n + 1$:

$$|\varphi_1(x)^k + \dots + \varphi_n(x)^k|_d = 0.$$
Conversely, as this is true for any polynomial factor of $P$, by searching for
the same kind of zero-sum relations among a subset of the $n$ linear factors of
$P$ over the ring of power series, we can find a family $\varphi_{i_1}(x), \dots, \varphi_{i_m}(x)$ such
that $(y - \varphi_{i_1}(x)) \cdots (y - \varphi_{i_m}(x))$ is a polynomial factor of $P$.

Each zero-sum gives rise to many relations on the coefficients of the $n$ series
$\varphi_i(x)$. As the first coefficients can be computed within some approximation,
Sasaki and his coworkers derived a method using linear algebra for recognizing
the indices $i$ of a factor. The Japanese school has been successful in inventing
original algorithms for that purpose. The algorithms of Sasaki et al. proceed by
filling a matrix with numerical coefficients coming from $k$-th powers of the series.
Then they look for elements in the kernel of that matrix whose coefficients
are only zero or one. They were able to provide a fine error analysis for their
method.


Remark 9.3.28. The above analysis has been recently completed in a work
of A. Bostan, G. Lecerf, B. Salvy, E. Schost and B. Wiebelt presented at
ISSAC’04 (see [BLST04]). In this work, they use logarithmic derivatives (as
in Gao’s algorithm), and zero-sum relations (as in Sasaki’s algorithm). In the
special case of absolute factorization, their algorithm studies the subspace $L_\sigma$
of $\mathbb{C}^n \times \mathbb{C}[X,Y]_{n-1}$ given by:


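$$L_\sigma = \Big\{ \big((\ell_1, \dots, \ell_n), G\big) \ \Big|\ \sum_{k=1}^{n} \ell_k \frac{P}{Y - \varphi_k(X)} \equiv G \bmod X^\sigma \Big\}.$$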


The idea here is, as in van Hoeij’s algorithm ([vH02]), to find 0-1 vectors
$(\epsilon_1^{(i)}, \dots, \epsilon_n^{(i)})$ such that $P_i = \prod_{k=1}^{n} (Y - \varphi_k(X))^{\epsilon_k^{(i)}}$. Since we have

$$\frac{P_i'}{P_i}\, P = \sum_{k=1}^{n} \epsilon_k^{(i)} \frac{P}{Y - \varphi_k(X)}$$

(where $P_i'$ denotes the first derivative of $P_i$ with respect to $Y$), this leads
to consider $L_\sigma$. We denote by $\pi(L_\sigma)$ the canonical projection of $L_\sigma$ to $\mathbb{C}^n$.
In their paper, they prove that if $\sigma \geq 3n - 2$, then with $\pi(L_\sigma)$ we can get
the absolute factors of $P$, and furthermore that this method is equivalent to
Sasaki’s. Finally, this method only uses Hensel’s lifting and linear algebra in
order to get 0-1 vectors $(\epsilon_1^{(i)}, \dots, \epsilon_n^{(i)})$. This method works in general for fields
of characteristic zero or at least $n(n - 1) + 1$.


9.4 Lecture 4: Reconstruction of the exact factors 


Thanks to the results of the last section, we know how to compute an approximate
absolute factorization. Thus here we are in the following situation:

We have an irreducible polynomial $P \in \mathbb{Q}[X,Y]$. Let us denote by $P =
P_1 \cdots P_s$ its absolute factorization. Let $\mathbb{Q}[\alpha]$ be the smallest extension of $\mathbb{Q}$
which contains all the coefficients of the factor $P_1$. Let $P \approx \tilde{P}_1 \cdots \tilde{P}_s$ be an
approximate absolute factorization of $P$. By this we mean that $\tilde{P}_i \in \mathbb{C}[X,Y]$
and the coefficients of $\tilde{P}_i$ are numerical approximations of the coefficients of
$P_i$ with a given precision $\epsilon$. That is to say $\|P_i - \tilde{P}_i\|_\infty < \epsilon$, with respect to
the norm $\|\sum_{i,j} a_{i,j} X^i Y^j\|_\infty = \max_{i,j} |a_{i,j}|$.

A natural question is: Can we get an exact factorization from the approximate
one? If it is possible, how can we find the minimal polynomial of $\alpha$ over
$\mathbb{Q}$, and how can we express the coefficients of $P_1$ in $\mathbb{Q}[\alpha]$?

We will answer positively if $\epsilon$ is small enough. As the coefficients of $P_1$ are
given with an error $\epsilon$, in order to find the minimal polynomial $f_\alpha$ of $\alpha$ ($f_\alpha \in
\mathbb{Q}[T]$), we have to recognize its coefficients, which are rational numbers,
from imprecise floating point numbers. David Rupprecht gave a preliminary
study of this problem in [Rup00]. Here we present a complete and satisfactory
answer. We closely follow the exposition given in [CG03].


9.4.1 Notations and elementary results 


In all this lecture we have: $P \in \mathbb{Q}[X,Y]$ and $P = P_1 \cdots P_s$ in $\mathbb{C}[X,Y]$.

The $P_i$ are the irreducible factors of $P$ in $\mathbb{C}[X,Y]$. $K$ is the smallest field which
contains all the coefficients of $P_1$; this is a finite extension of $\mathbb{Q}$. By the
primitive element theorem we can write $K = \mathbb{Q}[\alpha]$. Let $x \in K$; we denote by
$f_x$ the minimal polynomial of $x$ over $\mathbb{Q}$. $\mathcal{O}_K$ is the ring of algebraic integers
in $K$: if $x \in \mathcal{O}_K$ then $f_x(T) \in \mathbb{Z}[T]$ and is monic.

Let $x \neq 0$ be an element of $K$. We denote by $m_x$ the homomorphism of
multiplication by $x$ in $K$, by $P_{char}(x)(T)$ the characteristic polynomial of $m_x$
and by $\mathrm{Tr}_{K/\mathbb{Q}}(x)$ the trace of $m_x$.

We recall that $P_{char}(x)(T) = f_x^k$ where $k = [K : \mathbb{Q}(x)]$ is the degree of $K$
over $\mathbb{Q}(x)$.

Let $[K : \mathbb{Q}] = s$ and $(x_1, \dots, x_s)$ be an element of $K^s$. We define the
discriminant $\mathrm{disc}_{K/\mathbb{Q}}(x_1, \dots, x_s)$ to be the determinant of the matrix whose
$(i, j)$-coefficient is $\mathrm{Tr}_{K/\mathbb{Q}}(x_i x_j)$ (for $i, j = 1, \dots, s$).

For the special case $(1, \alpha, \alpha^2, \dots, \alpha^{s-1})$, where $\alpha$ is a primitive element of
$K$ over $\mathbb{Q}$, we set

$$\mathrm{disc}_{K/\mathbb{Q}}(\alpha) = \mathrm{disc}_{K/\mathbb{Q}}(1, \alpha, \dots, \alpha^{s-1})$$

and call this number the discriminant of $\alpha$. If $f_\alpha(T) = T^n + a_{n-1} T^{n-1} + \dots + a_0$
then we denote by $\mathrm{Disc}(f_\alpha)$ the number satisfying

$$\mathrm{Res}(f_\alpha, f_\alpha') = (-1)^{n(n-1)/2}\, \mathrm{Disc}(f_\alpha)$$

where Res is the resultant and $f_\alpha'$ is the derivative of $f_\alpha$. When $\alpha$ is a primitive
element of $K$, we have

$$\mathrm{disc}_{K/\mathbb{Q}}(\alpha) = \mathrm{Disc}(f_\alpha).$$
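
As a small illustration of these definitions (an example chosen here, not taken from the text), take $K = \mathbb{Q}(\sqrt{2})$ and $\alpha = \sqrt{2}$, so that $s = 2$ and $f_\alpha(T) = T^2 - 2$. Both ways of computing the discriminant give 8:

```latex
% Trace form on the basis (1, \alpha):
\[
  \mathrm{Tr}_{K/\mathbb{Q}}(1) = 2, \quad
  \mathrm{Tr}_{K/\mathbb{Q}}(\alpha) = 0, \quad
  \mathrm{Tr}_{K/\mathbb{Q}}(\alpha^2) = 4,
  \qquad
  \mathrm{disc}_{K/\mathbb{Q}}(\alpha)
    = \det \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix} = 8 .
\]
% Resultant of f_\alpha and its derivative (f_\alpha monic, so the resultant
% is the product of f_\alpha' over the roots of f_\alpha):
\[
  \mathrm{Res}(f_\alpha, f_\alpha')
    = f_\alpha'(\sqrt{2})\, f_\alpha'(-\sqrt{2})
    = (2\sqrt{2})(-2\sqrt{2}) = -8
    = (-1)^{2(2-1)/2}\, \mathrm{Disc}(f_\alpha),
  \qquad \mathrm{Disc}(f_\alpha) = 8 .
\]
```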
9.4.2 The strategy


Our aim is to compute the minimal polynomial of a primitive element $\alpha$ of $K$
and then the coefficients of $P_1$ in $K$.

Our strategy is based on the following observations.

Let $P_i(X,Y) = \sum_{u,v} a_i^{(u,v)} X^u Y^v$. Then we have (by the fundamental
lemma):

$$P_{char}(a_1^{(u,v)})(T) = \prod_{i=1}^{s} (T - a_i^{(u,v)}) = T^s + c_{s-1} T^{s-1} + \dots + c_0.$$

Furthermore if $\gcd\big(P_{char}(a_1^{(u,v)}), \frac{\partial}{\partial T} P_{char}(a_1^{(u,v)})\big) = 1$, then $a_1^{(u,v)}$ is a
primitive element of $K$ and consequently $P_{char}(a_1^{(u,v)})(T) = f_{a_1^{(u,v)}}(T)$. So it
is easy to obtain theoretically the minimal polynomial of a coefficient which
is a primitive element of $K$. But in our situation we do not have exact data
$a_1^{(u,v)}, \dots, a_s^{(u,v)}$; we only have approximations $a_1^{(u,v)} + \epsilon_1, \dots, a_s^{(u,v)} + \epsilon_s$, and
a bound $\epsilon$ on the errors: $|\epsilon_i| < \epsilon$. Expanding $\prod_{i=1}^{s} (T - a_i^{(u,v)} - \epsilon_i)$ we get

$$T^s + c_{s-1}(\epsilon)\, T^{s-1} + \dots + c_0(\epsilon),$$

and we have to recognize $c_i$ from $c_i(\epsilon)$.

However, since we do not have a bound on the denominators of the rational
numbers $c_i$, this might be hard.

In order to avoid this difficulty, we show that we can restrict our study
to a polynomial $P(X,Y) \in \mathbb{Z}[X,Y]$. Then we prove that the coefficients of
$P_i$ are algebraic integers over $\mathbb{Z}$. Therefore, the coefficients of the minimal
polynomial will be integers, so it is easy to recognize them, and we can certify
the result.
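
The recognition step in the integer case can be sketched as follows: expand the product over the approximate conjugates and round each coefficient to the nearest integer, refusing to answer when a coefficient is not close to an integer. This is only an illustration of the idea, assuming the exact coefficients are integers and the accumulated error stays well below $1/2$; the function names, the tolerance and the sample data are hypothetical, and no certified error bound is computed.

```python
def poly_from_roots(roots):
    """Coefficients of prod (T - r), highest degree first."""
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the current polynomial by (T - r).
        coeffs = [c - r * prev for c, prev in zip(coeffs + [0j], [0j] + coeffs)]
    return coeffs


def recognize_integer_coefficients(approx_roots, tol=0.01):
    """Round the coefficients of prod (T - r_i) to integers, or raise."""
    coeffs = poly_from_roots(approx_roots)
    rounded = [round(c.real) for c in coeffs]
    if any(abs(c - r) > tol for c, r in zip(coeffs, rounded)):
        raise ValueError("coefficients are not close enough to integers")
    return rounded


# Illustrative data: perturbed conjugates sqrt(2) and -sqrt(2).
approx_conjugates = [1.41421356 + 1e-9j, -1.41421357 - 2e-9j]
minimal_polynomial = recognize_integer_coefficients(approx_conjugates)
```

On this sample the result is `[1, 0, -2]`, that is $T^2 - 2$.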

In Section 9.4.6 we propose a method to obtain the expression of the
coefficients of $P_1$ in $K$. We will use the fundamental lemma and adapted
representations of these algebraic integers over $\mathbb{Z}$. The algorithm has been
implemented. The last subsection provides illustrative examples.


9.4.3 Reduction to Z[X, Y] 


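Let $Q(X,Y) = \sum_{i=0}^{n} \sum_{j \leq i} q_{j,n-i}\, X^j Y^{n-i}$ be an irreducible and monic polynomial
in $\mathbb{Q}[X,Y]$ of total degree $n$. Let $d$ be a common denominator of the
coefficients of $Q$; that is to say $d\, q_{j,n-i} \in \mathbb{Z}$. Then $d^n Q$ is irreducible in $\mathbb{Q}[X,Y]$
and $d^n Q(X,Y) = \sum_{i=0}^{n} \sum_{j \leq i} d^i q_{j,n-i}\, X^j (dY)^{n-i}$. Setting $Z = dY$ we define
$P(X,Z) \in \mathbb{Z}[X,Z]$ by

$$d^n Q(X,Y) = d^n Q\Big(X, \frac{Z}{d}\Big) = Z^n + d\, q_{1,n-1}\, X Z^{n-1} + \dots + d^n q_{0,0} = P(X,Z).$$

Since $d^n Q(X,Y)$ is irreducible in $\mathbb{Q}[X,Y]$, $d^n Q(X, \frac{Z}{d})$ is irreducible in $\mathbb{Q}[X,Z]$.

Hence $P(X,Z)$ is monic in $Z$, irreducible in $\mathbb{Q}[X,Z]$ and belongs to $\mathbb{Z}[X,Z]$.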
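
The substitution $P(X,Z) = d^n Q(X, Z/d)$ simply multiplies the coefficient of $X^j Y^k$ by $d^{n-k}$. The sketch below applies this to a small example with exact rational arithmetic; the dictionary representation, the function name and the sample polynomial are illustrative choices, and $d$ is taken to be the least common denominator.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def clear_denominators(Q, n):
    """Return (d, P) with P(X, Z) = d**n * Q(X, Z/d).

    Q maps (j, k) to the rational coefficient of X**j * Y**k and is assumed
    monic of degree n in Y; d is the least common denominator of Q.
    """
    d = reduce(lambda a, b: a * b // gcd(a, b),
               (c.denominator for c in Q.values()), 1)
    P = {}
    for (j, k), c in Q.items():
        scaled = c * d ** (n - k)          # coefficient of X**j * Z**k
        assert scaled.denominator == 1     # integral by construction
        P[(j, k)] = scaled.numerator
    return d, P


# Q = Y**2 + (1/2)*X*Y + 1/3
Q_example = {(0, 2): Fraction(1), (1, 1): Fraction(1, 2), (0, 0): Fraction(1, 3)}
d_example, P_example = clear_denominators(Q_example, 2)
```

Here $d = 6$ and $P = Z^2 + 3XZ + 12$, which is monic in $Z$ with integer coefficients.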
We state two lemmas whose proofs are obvious. 
Lemma 9.4.1. Let $Q(X,Y)$ be a polynomial satisfying the hypothesis of the
fundamental lemma and $d$ a common denominator of the coefficients of $Q$.
Let $Q(X,Y) = Q_1(X,Y) \cdots Q_s(X,Y)$ be an absolute factorization. We set:

$$P(X,Y) = d^n Q\Big(X, \frac{Y}{d}\Big).$$

Then $P(X,Y) \in \mathbb{Z}[X,Y]$ is irreducible in $\mathbb{Q}[X,Y]$ and monic relative to $Y$,
and the $P_i(X,Y) = d^m Q_i(X, \frac{Y}{d}) \in \mathbb{C}[X,Y]$, with $m = n/s$ the degree of the $Q_i$, are the irreducible factors of $P$ in
$\mathbb{C}[X,Y]$.


Lemma 9.4.2. Let $K'$ be the smallest field generated by the coefficients of $Q_1$
and $K$ the smallest field generated by the coefficients of $P_1$. Then $K' = K$.


So from now on, we can suppose that our input polynomial belongs to
$\mathbb{Z}[X,Y]$.


9.4.4 The coefficients of $P_1$ are algebraic integers over Z


Here we first prove a lemma, then we prove the following theorem.


Theorem 9.4.3. Let P ∈ Z[X,Y] be monic in Y and irreducible in Q[X,Y].
Then it admits a factorization $P_1 \cdots P_s$ in C[X,Y] which consists of
polynomials whose coefficients are algebraic integers over Z.


Lemma 9.4.4. Let α be an algebraic number of degree s over Q and let
p(X) ∈ Q[α][X] be an integer over Z[X]. Then all the coefficients of p(X)
are integers over Z.


Proof. We denote by l the degree of p(X) relative to X. We remark that
Q(X)[α] is an extension of Q(X) of degree s. Moreover,

(*) all the conjugates of p(X) belong to C[X] and have degree l.

As Z[X] is an integrally closed ring we deduce (see e.g. [Sam67] p. 45) that

(**) the coefficients of the characteristic polynomial $P_{char}(p(X))$ of p(X)
over Q(X) are in Z[X].

Let k = [Q(X)[α] : Q(X)[p(X)]]. We denote the conjugates of p(X) over
Q(X) by $q_i$, where i = 1, ..., s/k and $q_1 = p(X)$. Then

\[ P_{char}(p(X))(Z) = \prod_{i=1}^{s/k} (Z - q_i)^k \]

is the characteristic polynomial of p(X).
Now we prove by induction that all the coefficients of p(X) are integers
over Z. We start with the leading term of p(X). We have:

\[ P_{char}(p(X))(Z) = Z^s + c_{s-1}(X) Z^{s-1} + \cdots + c_0(X) \]

with $c_i(X) \in \mathbb{Z}[X]$ by (**), and $\deg(c_{s-i}(X)) \le il$, because
$\deg(q_i) = \deg(p(X)) = l$ by (*). Thus $\deg(c_{s-i}(X)p(X)^{s-i}) = \deg(c_{s-i}(X)) +
(s-i)\deg(p(X)) \le ls$. As $P_{char}(p(X))(p(X)) = 0$ in Q[α][X], considering
the term of degree ls, we get $\lambda_l^s + \sum_{i \in I} lc(c_{s-i}) \lambda_l^{s-i} = 0$, where $lc(c_i)$
is the leading coefficient of $c_i(X)$, $\lambda_l = lc(p(X))$ and I is the set
$I = \{ i \mid \deg(c_{s-i}(X)p(X)^{s-i}) = ls \}$.

The fact that all $lc(c_i)$ are integers implies that $\lambda_l$ is an algebraic integer
over Z and therefore $\lambda_l X^l$ is an algebraic integer over Z[X].

To prove the other steps of the induction, we remark that $p(X) - \lambda_l X^l$
belongs to Q[α][X] and is an integer over Z[X]; we can then repeat the same
argument with $p(X) - \lambda_l X^l$ instead of p(X).


Now we can prove the theorem. 


Proof. As in the previous section, Q[α] is the extension field generated by all
the coefficients of $P_1$, and the degree of α over Q is s. By Steinitz's theorem,
there exists an algebraically closed field K such that K ⊃ Q(X) ⊃ Z[X] and

\[ P(X,Y) = Y^n + a_{n-1}(X)Y^{n-1} + \cdots + a_0(X) = \prod_{i=1}^{n} (Y - r_i(X)) \]

where $r_i(X) \in K$ and $r_i(X)$ is an algebraic integer over Z[X]. To be more
precise in the description of $r_i(X)$, there is an integer p such that
$r_i(X) \in \mathbb{C}[[X^{1/p}]]$; see, e.g., [Eis95, p. 300], Corollary 13.16.

Since $P_1(X,Y)$ is a factor of P(X,Y), we have

\[ P_1(X,Y) = \prod_{i=1}^{m} (Y - r_i(X)) = Y^m + p_{m-1}(X)Y^{m-1} + \cdots + p_0(X) \]

where the $p_i(X)$ are in Q[α][X] and are integers over Z[X] because they are
polynomials in the $r_i(X)$. Then we can apply the previous lemma to each $p_i(X)$.


9.4.5 Finding a primitive element 


The coefficients of $P_1$ generate an extension K of Q. We aim to get a primitive
element of K which is an algebraic integer over Z.

First we check if there is a primitive element among the coefficients of
$P_1$. If this is not the case, we present a method which constructs a primitive
element.


Recognition


We let $P_i(X,Y) = \sum_{u,v} a_i^{(u,v)} X^u Y^v$. The fundamental lemma implies that
$P_{char}(a_1^{(u,v)})(T) = \prod_{i=1}^{s}(T - a_i^{(u,v)})$. Furthermore, $a_i^{(u,v)} \ne a_j^{(u,v)}$ for all $i \ne j$
if and only if $a_1^{(u,v)}$ is a primitive element of K. Thus,

$a_1^{(u,v)}$ is a primitive element of K

if and only if

\[ \gcd\bigl(P_{char}(a_1^{(u,v)}),\ \partial_T P_{char}(a_1^{(u,v)})\bigr) = 1. \]

We derived the following characterization:


Lemma 9.4.5. With the previous notations, we have:

\[ \gcd\bigl(P_{char}(a_1^{(u,v)}),\ \partial_T P_{char}(a_1^{(u,v)})\bigr) = 1 \]

if and only if $a_1^{(u,v)}$ is a primitive element of K.

In this case $P_{char}(a_1^{(u,v)})$ is the minimal polynomial $f_{a_1^{(u,v)}}$ of $a_1^{(u,v)}$
over Q. Moreover $a_1^{(u,v)}$ is an algebraic integer over Z.


This lemma allows us to recognize effectively a primitive element. 
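
The test of Lemma 9.4.5 amounts to checking that a characteristic polynomial with integer coefficients is squarefree. The sketch below does this with exact rational arithmetic from the Python standard library. It is an illustration under assumptions, not the authors' code: the helper names are hypothetical, polynomials are stored as coefficient lists with the lowest degree first, and the input is assumed to have been rounded to integer coefficients already.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(a, b):
    """Remainder of a divided by b over Q (b must be nonzero)."""
    a = trim(a)
    while a and len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        for i, y in enumerate(b):
            a[shift + i] -= c * y
        a = trim(a)
    return a


def gcd_poly(a, b):
    """Monic gcd of two polynomials over Q (Euclidean algorithm)."""
    a = trim(Fraction(c) for c in a)
    b = trim(Fraction(c) for c in b)
    while b:
        a, b = b, remainder(a, b)
    return [c / a[-1] for c in a]


def is_primitive(char_poly):
    """True when gcd(P_char, dP_char/dT) = 1, i.e. P_char is squarefree."""
    derivative = [i * c for i, c in enumerate(char_poly)][1:]
    return len(gcd_poly(char_poly, derivative)) == 1
```

For instance, `is_primitive([47, -14, 1])` tests T^2 - 14T + 47, while `is_primitive([49, -14, 1])` tests (T - 7)^2, which has a repeated root.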


Construction 


If none of the coefficients of $P_1$ is primitive, we construct, with high
probability, a primitive element which is an integer over Z. By Lemma 9.4.5 we
can check whether this constructed element is primitive.

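We denote by $\sigma_i$ ($1 \le i \le s$) the s independent Q-homomorphisms from
K to C and by $a_1^{(u,v)}$ the coefficients of $P_1$. We recall that they generate K.

For any pair (i,j) such that $i \ne j$, there exists a coefficient $a_1^{(u,v)}$ of $P_1$
such that $\sigma_i(a_1^{(u,v)}) \ne \sigma_j(a_1^{(u,v)})$. Thus the polynomial

\[ \Delta(\lambda_{(1,0)}, \ldots, \lambda_{(2,n-1)}) = \prod_{i<j} \Bigl( (\sigma_i - \sigma_j)(a_1^{(0,0)}) + \lambda_{(1,0)} (\sigma_i - \sigma_j)(a_1^{(1,0)}) + \cdots + \lambda_{(2,n-1)} (\sigma_i - \sigma_j)(a_1^{(2,n-1)}) \Bigr) \]

is a nonzero polynomial in $\mathbb{C}[\lambda_{(i,j)}]$. So we can find $(\lambda_{(1,0)}, \ldots, \lambda_{(2,n-1)})$ with
$\lambda_{(i,j)} \in \mathbb{Z}$ such that for all $i \ne j$:

\[ \sigma_i\bigl(a_1^{(0,0)} + \lambda_{(1,0)} a_1^{(1,0)} + \cdots + \lambda_{(2,n-1)} a_1^{(2,n-1)}\bigr) \]

differs from

\[ \sigma_j\bigl(a_1^{(0,0)} + \lambda_{(1,0)} a_1^{(1,0)} + \cdots + \lambda_{(2,n-1)} a_1^{(2,n-1)}\bigr). \]

This means that $a_1^{(0,0)} + \lambda_{(1,0)} a_1^{(1,0)} + \cdots + \lambda_{(2,n-1)} a_1^{(2,n-1)}$ is a primitive
element. We apply Lemma 9.2.15 (see exercise 9.2.14) to the polynomial
$\Delta(\lambda_{(1,0)}, \ldots, \lambda_{(2,n-1)}) \in \mathbb{C}[\lambda_{(i,j)}]$
and get the following proposition: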


Proposition 9.4.6. Let P be a polynomial in Z[X,Y], which is monic and
irreducible in Q[X,Y], and $P = P_1 \cdots P_s$ its irreducible decomposition in
C[X,Y]. Let $a_1^{(u,v)}$ denote the coefficients of $P_1$, and K the extension of Q
they generate.

Let S be a finite subset of Z. Then we have the following estimation of probability:

\[ \mathcal{P}\Bigl( a_1^{(0,0)} + s_{(1,0)} a_1^{(1,0)} + \cdots + s_{(2,n-1)} a_1^{(2,n-1)} \text{ non primitive} \Bigm| s_{(i,j)} \in S \Bigr) \le \frac{s(s-1)}{2|S|}. \]

The $a_1^{(u,v)}$ are integers over Z, so $a_1^{(0,0)} + s_{(1,0)} a_1^{(1,0)} + \cdots + s_{(2,n-1)} a_1^{(2,n-1)}$
is an algebraic integer over Z and the probability that this element is
non-primitive is at most $s(s-1)/(2|S|)$, since $\Delta$ is a product of $s(s-1)/2$
linear forms. So we can make the probability as small as we want.
Moreover, as it is possible to check the result with Lemma 9.4.5, this provides
an efficient method that is easy to implement.
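
The construction can be imitated numerically. The sketch below is a hedged illustration rather than the authors' implementation: the function name `random_combination`, the layout of `conjugates` (one row per coefficient, one column per embedding $\sigma_i$), and the tolerance-based distinctness test are all assumptions. In an exact setting the check would be the gcd test of Lemma 9.4.5.

```python
import math
import random
from itertools import combinations


def random_combination(conjugates, S, rng, tol=1e-9, max_tries=100):
    """Pick integers from S so that the combined element has distinct conjugates.

    conjugates[k][i] approximates sigma_i applied to the k-th coefficient.
    The first coefficient always gets the multiplier 1, as in the text.
    """
    s = len(conjugates[0])
    for _ in range(max_tries):
        lambdas = [1] + [rng.choice(S) for _ in conjugates[1:]]
        values = [sum(l * row[i] for l, row in zip(lambdas, conjugates))
                  for i in range(s)]
        # Numerical stand-in for primitivity: all conjugates pairwise distinct.
        if all(abs(x - y) > tol for x, y in combinations(values, 2)):
            return lambdas, values
    raise RuntimeError("no primitive combination found")


# Toy data: sqrt(2) and sqrt(3) under the four embeddings of Q(sqrt 2, sqrt 3).
r2, r3 = math.sqrt(2), math.sqrt(3)
conjugates = [[r2, r2, -r2, -r2], [r3, -r3, r3, -r3]]
lambdas, values = random_combination(conjugates, list(range(1, 11)), random.Random(0))
```

Neither sqrt(2) nor sqrt(3) is primitive on its own (each has repeated conjugates), but any combination sqrt(2) + λ sqrt(3) with a nonzero integer λ has four distinct conjugates.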




Choice of the precision 


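In practice we can only compute an approximation $f_{\alpha+\epsilon}(T)$ of a minimal
polynomial $f_\alpha(T)$, with $f_{\alpha+\epsilon}(T) = \prod_{i=1}^{s}(T - (\alpha_i + \epsilon_i))$.

We have perturbed roots and we want to know if the perturbation on the
coefficients is smaller than 0.5 in order to recognize the polynomial $f_\alpha$ from
$f_{\alpha+\epsilon}$. The following map describes the situation:

\[ \varphi : \mathbb{C}^s \to \mathbb{C}^s, \qquad (\alpha_1, \ldots, \alpha_s) \mapsto (S_1(\alpha), \ldots, S_s(\alpha)), \]

where

\[ S_1(\alpha_1, \ldots, \alpha_s) = \alpha_1 + \alpha_2 + \cdots + \alpha_s, \]
\[ S_i(\alpha_1, \ldots, \alpha_s) = \sum_{1 \le k_1 < \cdots < k_i \le s} \alpha_{k_1} \cdots \alpha_{k_i}, \]
\[ S_s(\alpha_1, \ldots, \alpha_s) = \alpha_1 \times \cdots \times \alpha_s. \]

We define $\|\cdot\|_\infty$ by $\|(a_1, \ldots, a_s)\|_\infty = \max_{i=1,\ldots,s} |a_i|$. We look for a condition
on $\epsilon$ which implies $\|\varphi(\alpha + \epsilon) - \varphi(\alpha)\|_\infty < 0.5$. $\varphi$ is a polynomial map such
that the degree of each component is less than or equal to s and is of degree
1 in each variable. With the notation $k = k_1 + \cdots + k_s$ for $(k_1, \ldots, k_s) \in \mathbb{N}^s$,
the Taylor expansion of $\varphi$ is

\[ \varphi(\alpha + \epsilon) - \varphi(\alpha) = \sum_{k=1}^{s} \ \sum_{k_1 + \cdots + k_s = k} \frac{1}{k_1! \cdots k_s!} \, \frac{\partial^k \varphi}{\partial\alpha_1^{k_1} \cdots \partial\alpha_s^{k_s}}(\alpha) \, \epsilon_1^{k_1} \cdots \epsilon_s^{k_s}. \]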


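We introduce the constants $\epsilon$ and M such that

• $|\alpha_i| \le M$ for all $1 \le i \le s$,
• $|\epsilon_i| \le \epsilon < 1$.


Lemma 9.4.7. With the previous notation, we have

\[ \|\varphi(\alpha + \epsilon) - \varphi(\alpha)\|_\infty \le \Bigl( \sum_{k=1}^{s} \binom{s}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s} \binom{s-k}{j-k} M^{j-k}\Bigr) \Bigr) \epsilon. \]


Proof. The total degree of the polynomial $S_j$ is j, so we deduce

• If k > j then $\dfrac{\partial^k S_j}{\partial\alpha_1^{k_1} \cdots \partial\alpha_s^{k_s}} = 0$.

• If k = j then $\dfrac{\partial^k S_j}{\partial\alpha_1^{k_1} \cdots \partial\alpha_s^{k_s}}(\alpha) \in \{0, 1\}$.

Moreover, we easily get the following upper bound, for k < j,

\[ \Bigl| \frac{\partial^k S_j}{\partial\alpha_1^{k_1} \cdots \partial\alpha_s^{k_s}}(\alpha) \Bigr| \le \binom{s-k}{j-k} M^{j-k}. \]

As a result we obtain

\[ \Bigl\| \frac{\partial^k \varphi}{\partial\alpha_1^{k_1} \cdots \partial\alpha_s^{k_s}}(\alpha) \Bigr\|_\infty \le \max\Bigl(1, \max_{j=k+1,\ldots,s} \binom{s-k}{j-k} M^{j-k}\Bigr). \]

It follows that

\[ \|\varphi(\alpha + \epsilon) - \varphi(\alpha)\|_\infty \le \sum_{k=1}^{s} \binom{s}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s} \binom{s-k}{j-k} M^{j-k}\Bigr) \epsilon^k. \]

Since $\epsilon < 1$, we deduce the required result.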
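
A quick numerical experiment makes the statement concrete. The sketch below compares the actual change of the elementary symmetric functions under a small perturbation of the roots with the constant of Lemma 9.4.7 times ε. It is an illustration under assumptions: the function names and sample data are hypothetical, and the constant is taken as $\sum_{k=1}^{s} \binom{s}{k} k! \max(1, \max_{j>k} \binom{s-k}{j-k} M^{j-k})$, which is how the lemma is read here.

```python
from math import comb, factorial


def elementary_symmetric(values):
    """Return [S_1, ..., S_s] by expanding prod(1 + a*t) one root at a time."""
    e = [1]
    for a in values:
        e = [(e[i] if i < len(e) else 0) + (a * e[i - 1] if i > 0 else 0)
             for i in range(len(e) + 1)]
    return e[1:]


def lemma_constant(s, M):
    """Constant multiplying epsilon in the bound, as read from Lemma 9.4.7."""
    total = 0
    for k in range(1, s + 1):
        inner = max([1] + [comb(s - k, j - k) * M ** (j - k)
                           for j in range(k + 1, s + 1)])
        total += comb(s, k) * factorial(k) * inner
    return total


alpha = [3.0, -2.0, 1.5]      # |alpha_i| <= M = 3
eps = [1e-3, -1e-3, 5e-4]     # |eps_i| <= epsilon = 1e-3
exact = elementary_symmetric(alpha)
perturbed = elementary_symmetric([a + e for a, e in zip(alpha, eps)])
change = max(abs(x - y) for x, y in zip(perturbed, exact))
bound = lemma_constant(3, 3) * 1e-3
```

On this sample the observed change stays well below the bound, which is expected since the estimate is a worst-case one.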


Corollary 9.4.8. With the previous notations, if the error $\epsilon$ on the roots is
bounded by

\[ \Bigl( 2 \sum_{k=1}^{s} \binom{s}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s} \binom{s-k}{j-k} M^{j-k}\Bigr) \Bigr)^{-1} \]

then the error on the coefficients of $f_{\alpha+\epsilon}$ is smaller than 0.5.
So we have proven the following proposition.


Proposition 9.4.9. We denote by Digits1 the number of significant digits
used for the computation of the minimal polynomial. If

\[ \mathrm{Digits1} \ge \log_{10}\Bigl( 2 \sum_{k=1}^{s} \binom{s}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s} \binom{s-k}{j-k} M^{j-k}\Bigr) \Bigr) + \log_{10}\Bigl( \max_{i=1,\ldots,s} \binom{s}{i} M^i \Bigr), \]

then we can recognize all the coefficients of $f_\alpha(T)$ from the coefficients of
$f_{\alpha+\epsilon}(T)$.


To give an idea of the size of Digits1, we provide the following table.


  s | M     | Digits1      s | M     | Digits1
  2 | 10^5  |  16         10 | 10^5  |  97
  2 | 10^10 |  31         10 | 10^10 | 192
  2 | 10^20 |  61         10 | 10^20 | 382
  5 | 10^5  |  47         15 | 10^5  | 147
  5 | 10^10 |  91         15 | 10^10 | 292
  5 | 10^20 | 182         15 | 10^20 | 582
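
The size of Digits1 can be evaluated with exact integer arithmetic. The sketch below is a hedged illustration, not the authors' code. It assumes that the bound of Proposition 9.4.9 reads $\log_{10}(2 \sum_{k=1}^{s} \binom{s}{k} k! \max(1, \max_{j>k} \binom{s-k}{j-k} M^{j-k})) + \log_{10}(\max_{i} \binom{s}{i} M^i)$ and that the result is rounded up to the next integer. The name `digits1` is hypothetical.

```python
import math
from math import comb, factorial


def digits1(s, M):
    """Smallest integer at least as large as the bound of Proposition 9.4.9.

    s is the number of conjugates and M an integer bound on their moduli.
    """
    total = 0
    for k in range(1, s + 1):
        inner = max([1] + [comb(s - k, j - k) * M ** (j - k)
                           for j in range(k + 1, s + 1)])
        total += comb(s, k) * factorial(k) * inner
    size = max(comb(s, i) * M ** i for i in range(1, s + 1))
    # math.log10 accepts arbitrarily large Python integers.
    return math.ceil(math.log10(2 * total) + math.log10(size))
```

With these assumptions `digits1(2, 10**5)` and `digits1(10, 10**5)` agree with the corresponding table entries (16 and 97), and `digits1(2, 10)` gives the value 4 used in the worked example of Section 9.4.9.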


9.4.6 A method to obtain the exact factorization 


We start with the minimal polynomial $f_\alpha$ of a primitive element α of K, obtained as
explained in the last section. We will use another canonical representation of
the coefficients of $P_1$.


$f'_\alpha(\alpha)$ is a common denominator


We recall some classical results of algebraic number theory.


Definition 9.4.10. Let K be a finite extension of Q. Let M be a subset of K.
We set $M^* = \{x \in K \mid \forall y \in M,\ \mathrm{Tr}_{K/\mathbb{Q}}(xy) \in \mathbb{Z}\}$ and call it the complementary
set of M.


Proposition 9.4.11. (see [Rib01, page 242]) Let K be a finite extension of
Q, $\alpha \in O_K$ a primitive element of K and $f_\alpha$ its minimal polynomial. Then we
have $O_K \subset \mathbb{Z}[\alpha]^* = \frac{1}{f'_\alpha(\alpha)} \mathbb{Z}[\alpha]$. This implies that all $a \in O_K$ can be written
in the following way:

\[ a = \frac{z_0}{f'_\alpha(\alpha)} + \frac{z_1}{f'_\alpha(\alpha)} \alpha + \cdots + \frac{z_{s-1}}{f'_\alpha(\alpha)} \alpha^{s-1} \quad \text{with } z_i \in \mathbb{Z}. \]


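Recognition of the coefficients of $P_1$


Having the denominator $f'_\alpha(\alpha)$, we only have to recognize the numerators. Let
$a_1^{(u,v)}$ be a coefficient of $P_1$, so $a_1^{(u,v)}$ belongs to $O_K$. We have

\[ a_1^{(u,v)} = \frac{z_0}{f'_\alpha(\alpha)} + \frac{z_1}{f'_\alpha(\alpha)} \alpha + \cdots + \frac{z_{s-1}}{f'_\alpha(\alpha)} \alpha^{s-1}. \]

Applying the Q-homomorphism $\sigma_i$, we get

\[ a_i^{(u,v)} = \frac{z_0}{f'_\alpha(\sigma_i(\alpha))} + \frac{z_1}{f'_\alpha(\sigma_i(\alpha))} \sigma_i(\alpha) + \cdots + \frac{z_{s-1}}{f'_\alpha(\sigma_i(\alpha))} \sigma_i(\alpha)^{s-1}, \]

hence

\[ (\star) \qquad
\begin{pmatrix}
1 & \sigma_1(\alpha) & \sigma_1(\alpha)^2 & \cdots & \sigma_1(\alpha)^{s-1} \\
1 & \sigma_2(\alpha) & \sigma_2(\alpha)^2 & \cdots & \sigma_2(\alpha)^{s-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \sigma_s(\alpha) & \sigma_s(\alpha)^2 & \cdots & \sigma_s(\alpha)^{s-1}
\end{pmatrix}
\begin{pmatrix} z_0 \\ z_1 \\ \vdots \\ z_{s-1} \end{pmatrix}
=
\begin{pmatrix}
f'_\alpha(\sigma_1(\alpha)) \, a_1^{(u,v)} \\
f'_\alpha(\sigma_2(\alpha)) \, a_2^{(u,v)} \\
\vdots \\
f'_\alpha(\sigma_s(\alpha)) \, a_s^{(u,v)}
\end{pmatrix}. \]

We remark that in practice we do not have $a_i^{(u,v)}$ but $a_i^{(u,v)} + \nu_i$, and we do not
have $\sigma_i(\alpha)$ but $\sigma_i(\alpha) + \epsilon_i$. So we need to solve the Vandermonde system and
take the nearest integer to each component of the solution. Now we explain
how to certify the result.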
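
The recognition step can be sketched as follows: build the Vandermonde matrix from the approximate conjugates, solve the linear system, and round. This is a minimal illustration and not the authors' implementation. The helper names are hypothetical, a plain Gaussian elimination stands in for a dedicated Vandermonde solver, and the sample data are the four-digit values of the worked example in Section 9.4.9, where $f'_\alpha(T) = 2T - 14$.

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(b)
    m = [list(map(complex, row)) + [complex(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        acc = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / m[r][r]
    return x


def recognize_numerators(alphas, fprime_values, coeff_values):
    """Solve V z = (f'(alpha_i) * a_i)_i and round each z_j to the nearest integer."""
    s = len(alphas)
    vandermonde = [[alpha ** j for j in range(s)] for alpha in alphas]
    rhs = [fp * a for fp, a in zip(fprime_values, coeff_values)]
    return [round(z.real) for z in solve_linear(vandermonde, rhs)]


alphas = [8.414, 5.585]                  # approximate conjugates of alpha
fprime = [2 * a - 14 for a in alphas]    # f'(T) = 2T - 14 in the example
coeffs = [3.828, -1.828]                 # approximate conjugates of the coefficient
z = recognize_numerators(alphas, fprime, coeffs)
```

The rounded solution is z = [-6, 2], in agreement with the worked example. Complex arithmetic is used because the conjugates of α need not be real in general.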


Choice of the precision 


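First we set some notation: $M_{m,n}(\mathbb{C})$ is the set of matrices with m rows
and n columns, with coefficients in C. If $M = (m_{i,j})_{i,j=0}^{s-1}$ is a matrix of
$M_{s,s}(\mathbb{C})$, let $\|M\|_\infty = \max_{i=0,\ldots,s-1} \sum_{j=0}^{s-1} |m_{i,j}|$. If v is a vector of $\mathbb{C}^s$ (with i-th
coordinate equal to $v_i$), then $\|v\|_\infty = \max_{i=0,\ldots,s-1} |v_i|$. With this notation we
have $\|Mv\|_\infty \le \|M\|_\infty \|v\|_\infty$. Now we set: $\alpha_i = \sigma_i(\alpha)$, $\epsilon_i$ is the error on $\alpha_i$,
$\nu_i$ is the error on $a_i^{(u,v)}$, $e_i$ is the error on $z_i$, $\epsilon$ is a real number such that

\[ \forall\, 1 \le i \le s, \quad |\epsilon_i| < \epsilon < 1 \quad \text{and} \quad |\nu_i| < \epsilon < 1, \]

M is a real number such that $\max_{i,u,v} |a_i^{(u,v)}| \le M$,

\[ M(\alpha) =
\begin{pmatrix}
1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{s-1} \\
1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{s-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha_s & \alpha_s^2 & \cdots & \alpha_s^{s-1}
\end{pmatrix}^{-1}
\begin{pmatrix}
f'_\alpha(\alpha_1) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & f'_\alpha(\alpha_s)
\end{pmatrix}, \]

\[ z = \begin{pmatrix} z_0 \\ \vdots \\ z_{s-1} \end{pmatrix}, \quad
a^{(u,v)} = \begin{pmatrix} a_1^{(u,v)} \\ \vdots \\ a_s^{(u,v)} \end{pmatrix}, \quad
\epsilon = \begin{pmatrix} \epsilon_1 \\ \vdots \\ \epsilon_s \end{pmatrix}, \quad
e = \begin{pmatrix} e_0 \\ \vdots \\ e_{s-1} \end{pmatrix}, \quad
\nu = \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_s \end{pmatrix}. \]

Then we have the equality $z + e = M(\alpha + \epsilon)(a^{(u,v)} + \nu)$.

Now, we are going to give an expression for the coefficients of M(α) as a
function of the $\alpha_i$. We will deduce that $M(\alpha + \epsilon) = M(\alpha) + \epsilon N$, where N is a
matrix with bounded coefficients, and hence get the bound

\[ \|e\|_\infty \le \bigl( \|M(\alpha)\|_\infty + \|N\|_\infty (1 + M) \bigr) \epsilon. \]

Expression of the coefficients of M(α) and M(α + ε)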
Lemma 9.4.12. Let $M(\alpha) = (m_{i,j}(\alpha))_{i,j=0}^{s-1}$; then we have:

\[ m_{i,j}(\alpha) = (-1)^{s-i-1} S_{s-i-1}(\alpha_1, \ldots, \alpha_j, \alpha_{j+2}, \ldots, \alpha_s). \]


Proof. We denote by $V(\alpha)^{-1} = (w_{i,j}(\alpha))_{i,j=0}^{s-1}$ the inverse of the Vandermonde
matrix.

The value of the polynomial $l_k(x) = \sum_{j=0}^{s-1} w_{j,k} x^j$ is 1 when $x = \alpha_{k+1}$ and it
is 0 when $x \in \{\alpha_1, \ldots, \alpha_s\} \setminus \{\alpha_{k+1}\}$. Hence $l_k(x)$ is the Lagrange polynomial
and we get

\[ l_k(x) = \prod_{i \ne k+1} \frac{x - \alpha_i}{\alpha_{k+1} - \alpha_i} = \prod_{i \ne k+1} (x - \alpha_i) \times \frac{1}{f'_\alpha(\alpha_{k+1})}. \]

Therefore

\[ w_{j,k}(\alpha) = \frac{(-1)^{s-1-j} S_{s-1-j}(\alpha_1, \ldots, \alpha_k, \alpha_{k+2}, \ldots, \alpha_s)}{f'_\alpha(\alpha_{k+1})}, \]

where $S_i$ is the symmetric polynomial (see Section 9.4.5), and we set $S_0 = 1$. The
definition of M(α) gives $m_{i,j}(\alpha) = w_{i,j}(\alpha) f'_\alpha(\alpha_{j+1})$. Thus the claim is true.


Corollary 9.4.13. There exists a matrix $N \in M_{s,s}(\mathbb{C})$ such that

\[ M(\alpha + \epsilon) = M(\alpha) + \epsilon N \]

with

\[ \|N\|_\infty \le s \sum_{k=1}^{s-1} \binom{s-1}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s-1} \binom{s-1-k}{j-k} M^{j-k}\Bigr). \]


Proof. Apply Lemma 9.4.7.
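
The closed formula of Lemma 9.4.12 can be checked on a small exact example. The sketch below builds M(α) from elementary symmetric polynomials and verifies that multiplying by the Vandermonde matrix V(α) gives the diagonal matrix of the values $f'_\alpha(\alpha_i)$, which is equivalent to $M(\alpha) = V(\alpha)^{-1} \mathrm{diag}(f'_\alpha(\alpha_i))$. The function names and the sample values (1, 2, 4) are assumptions chosen for illustration.

```python
from itertools import combinations
from math import prod


def elementary(values, k):
    """Elementary symmetric polynomial S_k of the given values (S_0 = 1)."""
    return sum(prod(c) for c in combinations(values, k))


def m_matrix(alphas):
    """M(alpha) with m_{i,j} = (-1)^(s-i-1) * S_{s-i-1}(alphas without alpha_{j+1})."""
    s = len(alphas)
    return [[(-1) ** (s - i - 1) * elementary(alphas[:j] + alphas[j + 1:], s - i - 1)
             for j in range(s)]
            for i in range(s)]


alphas = [1, 2, 4]
s = len(alphas)
M = m_matrix(alphas)
V = [[a ** j for j in range(s)] for a in alphas]
# V * M should be diag(f'(alpha_1), ..., f'(alpha_s)).
VM = [[sum(V[r][k] * M[k][c] for k in range(s)) for c in range(s)] for r in range(s)]
fprime = [prod(a - b for b in alphas if b != a) for a in alphas]
```

Here `fprime` is [3, -2, 6] and `VM` is the corresponding diagonal matrix.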


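Upper bound for $\|e\|_\infty$
In the last paragraph we showed that $M(\alpha + \epsilon) = M(\alpha) + \epsilon N$. So, we
deduce the following.


Lemma 9.4.14. With the previous notations, we have

\[ \|e\|_\infty \le \bigl( \|M(\alpha)\|_\infty + \|N\|_\infty (1 + M) \bigr) \epsilon. \]


Proof. The equality $z + e = M(\alpha + \epsilon)(a + \nu)$ becomes $z + e = (M(\alpha) + \epsilon N)(a + \nu)$.
Since $z = M(\alpha) a$, we get $e = M(\alpha)\nu + \epsilon N a + \epsilon N \nu$. We deduce that:

\[ \|e\|_\infty \le \|M(\alpha)\|_\infty \epsilon + \|N\|_\infty \epsilon^2 + \|Na\|_\infty \epsilon \le \bigl( \|M(\alpha)\|_\infty + \|N\|_\infty + \|N\|_\infty M \bigr) \epsilon. \]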


Conclusion 
The results of the previous parts lead to the following. 


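Proposition 9.4.15. If the error $\epsilon$ is such that

\[ \epsilon < 0.5 \Bigl( \max_{i=0,\ldots,s-1} \Bigl( s \binom{s-1}{s-i-1} M^{s-i-1} \Bigr) + s \Bigl( \sum_{k=1}^{s-1} \binom{s-1}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s-1} \binom{s-1-k}{j-k} M^{j-k}\Bigr) \Bigr) (1 + M) \Bigr)^{-1}, \]

then, with the system $(\star)$ (see Section 9.4.6), we can recognize the exact
coefficients of $P_1$.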


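Proof. Lemma 9.4.12 gives

\[ |m_{i,j}(\alpha)| \le S_{s-i-1}(|\alpha_1|, \ldots, |\alpha_j|, |\alpha_{j+2}|, \ldots, |\alpha_s|). \]

So

\[ |m_{i,j}(\alpha)| \le \sum_{1 \le k_1 < \cdots < k_{s-i-1} \le s-1} M^{s-i-1} = \binom{s-1}{s-i-1} M^{s-i-1}. \]

It follows that

\[ \sum_{j=0}^{s-1} |m_{i,j}| \le \sum_{j=0}^{s-1} \binom{s-1}{s-i-1} M^{s-i-1} = s \binom{s-1}{s-i-1} M^{s-i-1}, \]

so that $\|M(\alpha)\|_\infty \le \max_{i=0,\ldots,s-1} \bigl( s \binom{s-1}{s-i-1} M^{s-i-1} \bigr)$. Together with
Corollary 9.4.13 this implies

\[ \|e\|_\infty \le \Bigl( \max_{i=0,\ldots,s-1} \Bigl( s \binom{s-1}{s-i-1} M^{s-i-1} \Bigr) + s \Bigl( \sum_{k=1}^{s-1} \binom{s-1}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s-1} \binom{s-1-k}{j-k} M^{j-k}\Bigr) \Bigr) (1 + M) \Bigr) \epsilon. \]

So we get the stated bound.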


Proposition 9.4.16. We denote by Digits2 the number of significant digits
used for the step of recognition of the exact coefficients of $P_1$. If

\[ \mathrm{Digits2} \ge \log_{10}\Bigl( 2 \Bigl( \max_{i=0,\ldots,s-1} \Bigl( s \binom{s-1}{s-i-1} M^{s-i-1} \Bigr) + s \Bigl( \sum_{k=1}^{s-1} \binom{s-1}{k} k! \max\Bigl(1, \max_{j=k+1,\ldots,s-1} \binom{s-1-k}{j-k} M^{j-k}\Bigr) \Bigr) (1 + M) \Bigr) \Bigr) + \log_{10}\bigl( \cdots \bigr), \]

then we can recognize the coefficients of $P_1$ from the solution of the system
$(\star)$.


In order to give an idea of the size of Digits2 we provide the following
table.


  s | M     | Digits2      s | M     | Digits2
  2 | 10^5  |  16         10 | 10^5  |  98
  2 | 10^10 |  31         10 | 10^10 | 193
  2 | 10^20 |  61         10 | 10^20 | 383
  5 | 10^5  |  48         15 | 10^5  | 149
  5 | 10^10 |  93         15 | 10^10 | 294
  5 | 10^20 | 183         15 | 10^20 | 584


9.4.7 Conversion 


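Let $\beta \in O_K$. We have the following two representations:

\[ \beta = \sum_{j=0}^{s-1} \frac{z_j}{f'_\alpha(\alpha)} \alpha^j = \sum_{i=0}^{s-1} q_i \alpha^i \quad \text{where } z_j \in \mathbb{Z} \text{ and } q_i \in \mathbb{Q}. \]

Let B(α) be the inverse of $f'_\alpha(\alpha)$, and set $\alpha^j B(\alpha) = \sum_{i=0}^{s-1} b_{i,j} \alpha^i$, where
$b_{i,j} \in \mathbb{Q}$. The $b_{i,j}$ can be easily computed (once for all coefficients of $P_1$).


Lemma 9.4.17. With the previous notations and with
$q = (q_0, \ldots, q_{s-1})^T$, $z = (z_0, \ldots, z_{s-1})^T$, and $M_B = (b_{i,j})_{i,j=0}^{s-1} \in M_{s,s}(\mathbb{Q})$, we
have $q = M_B z$.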
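
The conversion can be carried out with exact arithmetic in Q[T]/(f_α). The sketch below inverts $f'_\alpha$ modulo $f_\alpha$ with the extended Euclidean algorithm and multiplies by the numerator $\sum_j z_j T^j$. It is an illustration under assumptions, not the authors' code: the helper names are hypothetical, polynomials are coefficient lists with the lowest degree first, and $f_\alpha$ is assumed squarefree so that $f'_\alpha(\alpha)$ is invertible.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def sub(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])


def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(a, b):
    """Quotient and remainder of a by b over Q (b must be nonzero)."""
    a = trim(a)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while a and len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        q[shift] = c
        for i, y in enumerate(b):
            a[shift + i] -= c * y
        a = trim(a)
    return trim(q), a


def inverse_mod(a, f):
    """Inverse of a modulo f by the extended Euclidean algorithm (gcd(a, f) = 1)."""
    r0, s0 = trim(f), []
    r1, s1 = divmod_poly(a, f)[1], [Fraction(1)]
    while r1:
        quo, rem = divmod_poly(r0, r1)
        r0, s0, r1, s1 = r1, s1, rem, sub(s0, mul(quo, s1))
    # r0 is now a nonzero constant and s0 * a = r0 modulo f.
    return divmod_poly([c / r0[0] for c in s0], f)[1]


def to_power_basis(z, f):
    """Coefficients q_i of sum_j z_j alpha^j / f'(alpha) in the basis 1, alpha, ..."""
    f = [Fraction(c) for c in f]
    fprime = [i * c for i, c in enumerate(f)][1:]
    b = inverse_mod(fprime, f)  # B(alpha) = f'(alpha)^(-1)
    return divmod_poly(mul([Fraction(c) for c in z], b), f)[1]


q = to_power_basis([-6, 2], [47, -14, 1])
```

With $f_\alpha(T) = T^2 - 14T + 47$ and numerators z = (-6, 2), the result is q = (-13, 2), that is, the element -13 + 2α.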


9.4.8 The algorithm 
Algorithm 9.4.18 THE CHEZE-GALLIGO ALGORITHM 


Input: P ∈ Z[X,Y] irreducible in Q[X,Y], monic in Y.


1. Compute an approximate absolute factorization of P, with a number of
significant digits equal to Digits.

Compute Digits1 and Digits2. If max(Digits1, Digits2) > Digits then
go to step 1 with Digits = max(Digits1, Digits2). Else go to step 2.

2. Recognize all the primitive coefficients of $P_1$ and their minimal
polynomials. [If no coefficient is primitive, then construct a primitive element.]
Choose a primitive element α. We denote by $f_\alpha$ its minimal polynomial.

3. Recognize the exact coefficients of $P_1$ by solving a Vandermonde system.
Give for each coefficient of $P_1$ its canonical expression in Q[α].


Output: The minimal polynomial of a primitive element of K and
$P_1(X,Y) \in K[X,Y]$, an absolute factor of P.
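
The precision loop of step 1 can be sketched as a small driver. This is a control-flow illustration only: both callables passed in are hypothetical placeholders for an approximate absolute factorization routine and for the evaluation of Digits1 and Digits2, and the cap `max_rounds` is an assumption added so that the loop always terminates.

```python
def factor_with_sufficient_precision(approximate_factorization, required_digits,
                                     digits=4, max_rounds=20):
    """Repeat step 1 until Digits >= max(Digits1, Digits2).

    approximate_factorization(digits) -> factor data    (hypothetical callable)
    required_digits(factor) -> (digits1, digits2)       (hypothetical callable)
    """
    for _ in range(max_rounds):
        factor = approximate_factorization(digits)
        needed = max(required_digits(factor))
        if needed <= digits:
            return factor, digits      # precision is sufficient: go to step 2
        digits = needed                # otherwise redo step 1 with more digits
    raise RuntimeError("precision did not stabilize")


# Stub callables, only to exercise the control flow.
def stub_factorization(digits):
    return {"digits": digits}


first_try = factor_with_sufficient_precision(stub_factorization, lambda f: (4, 4))
second_try = factor_with_sufficient_precision(stub_factorization, lambda f: (16, 12))
```

In the first call the initial precision already suffices. In the second call one extra round at 16 digits is needed.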


9.4.9 Description of the algorithm 


Input: $P(X,Y) = Y^4 + 2Y^2X + 14Y^2 - 7X^2 + 6X + 47$.


Step 1)
Apply an approximate absolute polynomial factorization to P with Digits = 4,
and get

$P_1(X,Y) = Y^2 + 3.828X + 8.414$,

$P_2(X,Y) = Y^2 - 1.828X + 5.585$.


We have s = 2 and we can take M = 10 (in fact we have to choose M ≥ 8.414),
which gives Digits1 = 4 and Digits2 = 4.
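
As a sanity check on step 1, the two approximate factors can be multiplied and compared with P. The sketch below is only an illustration: the dictionary representation, with keys (exponent of X, exponent of Y), and the function name `multiply` are assumptions.

```python
def multiply(p, q):
    """Product of two bivariate polynomials stored as {(deg_X, deg_Y): coeff}."""
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            key = (i + k, j + l)
            out[key] = out.get(key, 0.0) + a * b
    return out


P = {(0, 4): 1, (1, 2): 2, (0, 2): 14, (2, 0): -7, (1, 0): 6, (0, 0): 47}
P1 = {(0, 2): 1.0, (1, 0): 3.828, (0, 0): 8.414}
P2 = {(0, 2): 1.0, (1, 0): -1.828, (0, 0): 5.585}

product = multiply(P1, P2)
# Largest coefficient discrepancy between P1 * P2 and P.
error = max(abs(product.get(m, 0.0) - P.get(m, 0)) for m in set(product) | set(P))
```

With four significant digits the coefficients of the product agree with those of P to within 0.01.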


Step 2)
As before, we get

$f_{a_1^{(0,0)}}(T) = T^2 - 14T + 47$, and $\mathrm{Disc}(f_{a_1^{(0,0)}}) = 8$,

$f_{a_1^{(1,0)}}(T) = T^2 - 2T - 7$, and $\mathrm{Disc}(f_{a_1^{(1,0)}}) = 32$.

$\alpha = a_1^{(0,0)}$ is a primitive element of K, and $f_\alpha(T) = T^2 - 14T + 47$.


Step 3)
The Vandermonde system is

\[ \begin{pmatrix} 1 & 8.414 \\ 1 & 5.585 \end{pmatrix} \begin{pmatrix} z_0 \\ z_1 \end{pmatrix} = \begin{pmatrix} 2.828 \times 3.828 \\ -2.830 \times (-1.828) \end{pmatrix}. \]

This gives $z_0 = -5.989$ and $z_1 = 1.998$.
So $z_0 = -6$, $z_1 = 2$ and $a_1^{(1,0)} = \dfrac{-6}{f'_\alpha(\alpha)} + \dfrac{2}{f'_\alpha(\alpha)} \alpha$.

We have $f_\alpha(T) = T^2 - 14T + 47$, $f'_\alpha(T) = 2T - 14$ and
$-\frac{1}{2} f_\alpha(T) + f'_\alpha(T)\bigl(\frac{1}{4}T - \frac{7}{4}\bigr) = 1$. This implies $\frac{1}{4}\alpha - \frac{7}{4} = f'_\alpha(\alpha)^{-1}$.
Thus $a_1^{(1,0)} = \dfrac{-6}{f'_\alpha(\alpha)} + \dfrac{2}{f'_\alpha(\alpha)} \alpha = -13 + 2\alpha$.


Output: $f_\alpha(T) = T^2 - 14T + 47$, $P_1(X,Y) = Y^2 + (-13 + 2\alpha)X + \alpha$.